# Arlen Cox

Ramblings on my hobbies

## Statistical machine translation fails

I know that statistical machine translation (the SMT that does not mean Satisfiability Modulo Theories) is not about being correct all of the time, but sometimes it is really sensitive to odd things that should have nothing to do with the translation.  For example, I am learning French and I wanted to offer help to someone (in French).  So I tried translating the sentence:

Would you like me to help?

This is a completely natural sentence, but the grammatical structure is kind of complicated.  Instead of “me” being a direct object to “like”, it is part of a compound object “me to help.”  The great news is that Google Translate seemingly correctly translates this to French:

Voulez-vous que je vous aide?

Ok, it’s not conversational French per se, but it is adequate.  Literally translated, it reads

do you want that I help you?

This seems perfectly adequate.  However, what if you are typing this into your phone and you are particularly lazy?  For example, if you leave off the capitalization and the punctuation, you get instead:

would you like me to help → vous me voulez aider

This is a rather odd translation.  Now “me” is the direct object of “want,” so this translates to

you want me…to help

That’s kind of annoying.  How about the other cases, adding the question mark or the capitalization but not both:

would you like me to help? → voulez-vous que je aider?

This is a very odd translation.  Maybe there is some context where this is correct grammar, but my reading of this is essentially

do you want that I to help?

The other case is also kind of wrong:

Would you like me to help → Voulez-vous que pour aider

This is also an odd translation.  I’m not familiar with the grammatical structure, so I can’t conclude that it is wrong, but I think that it probably is.  The “que pour” structure doesn’t seem very common.  I believe that this roughly translates to:

Do you want in order to help

The lack of the second party (me) in the sentence is a bit odd.

The takeaway from this is that Google Translate is very sensitive to punctuation and capitalization.  It can completely change your sentence if you leave these out.  Of course, there are times when it gives the wrong translation when you add them, so there is no universally correct answer.  The only true answer is to study the language…

Written by arlencox

August 25, 2014 at 10:21 am

Posted in Uncategorized

## Location tracking in OCaml parsers

I guess that I have been so assimilated into the functional programming world that I was having a very basic issue when developing parsers in OCaml.  I had this issue with all of the parser generators (ocamlyacc, Menhir, Dypgen and Aurochs): I could never get a simple error location when there was a parse error.  They all raise an exception called something like Parse_error, but they give no location information with that exception.

I found this unbelievable.  I mean Antlr gives location information and Parsec does too.  What makes OCaml so obtuse?

As it turns out, OCaml is not obtuse; I am.  The lexers generated by these tools are stateful.  Therefore, when you get a parse error, you can simply query the lexer for the current location after the exception has been raised.  For example, using Dypgen:

let lexbuf = Dyp.from_channel (StrParse.pp ()) stdin in
try
  (* do parse *)
with Dyp.Syntax_error ->
  (* startpos and endpos are of the standard Lexing.position type *)
  let (startpos, endpos) = Dyp.dyplex_lexbuf_position lexbuf in
  Printf.eprintf "Parse error at line %d, column %d\n"
    startpos.Lexing.pos_lnum
    (startpos.Lexing.pos_cnum - startpos.Lexing.pos_bol)


This of course doesn’t provide error recovery, where multiple errors could be reported, and it doesn’t provide intelligent error messages, but it at least gives some information about the error.  This is what many research tools provide, since developing good error handling is a time-consuming task for code that will be quickly disposed of.

A very similar process can be followed with both ocamlyacc and Menhir to good effect, though now that I have discovered the power of Dypgen, I doubt I will go back.

Written by arlencox

February 22, 2013 at 2:16 am

Posted in Uncategorized

## Unofficial Laws

As an undergraduate I had a friend named Chad who invented his own named constant, which he called Chad’s “Constant.”  This particular constant is of use to any engineering or physics student who needs a numerical answer to a problem.  Chad’s “Constant” is defined as the correct answer to a problem minus my answer to the problem.  Consider this example:

$P=A$

Problem $P$ is equal to some correct answer $A$.  However, I compute (possibly incorrectly) that:

$P=B$

To ensure that I get full credit for my assignment, I can always add Chad’s “Constant,” $K$, in order to get the right answer:

$B + K = A$

If I add this as the last step to any problem, I can never fail.

## Arlen’s Law of Performance

In a similar vein, I will define Arlen’s Law of Performance, which applies to software development.  It is this:

If there is any doubt that performance will suck, performance will suck.

It’s not quite Amdahl’s Law, as there is no mathematics backing this up.  This is purely the observation that, probabilistically speaking, you are more likely to have bad performance than good.  If heuristics are used to accelerate a process, there is always a case where they fail, and that case will likely occur often.  SAT solving is an exception to this rule, but it has been carefully studied and a tremendous amount of effort has been put into making performance not suck most of the time.

Written by arlencox

September 7, 2012 at 5:48 pm

Posted in Uncategorized

## Color Spaces and the Web

Gack! I am working on creating a new website for myself at school.  Since I do a lot of photography these days, I thought I would put up a page about my photography.  Naturally I had my normal problems with HTML and CSS.  Nothing is getting centered correctly.  Alignment is odd and text never flows the way I want it to.  Oh well.  What I didn’t expect is to have problems with my photographs.

As I was proofreading my page, I realized that what I was saying about some of my photographs didn’t really correspond with the photographs.  I was saying that I really liked the color in some of the pictures and when I looked at the picture, I didn’t like the color.  In fact it looked nothing like I remembered.  I opened up iPhoto and looked at the original.  It looked great.  So what went wrong?

It turns out color spaces matter.  Check out this picture:

This is the exact same picture shown in two different web browsers.  The one on the left is shown in Google Chrome (my new web browser of choice).  The one on the right is shown in Apple Safari (my previous web browser of choice).  Whether you like it or not, I intended for the picture to look like the one on the right.  So why did this happen?

It turns out that the picture has an embedded color profile.  This tells the computer displaying the picture what the color values actually mean.  In this case, the picture uses the Adobe RGB space, which is a larger space than Standard RGB (sRGB).  When a browser that ignores color profiles renders the picture, it interprets the colors specified in Adobe RGB as sRGB colors.  This results in the muted colors seen in the left picture.
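The mismatch is easy to see numerically.  Below is a minimal standalone sketch (not what a color-managed browser literally runs) that converts a single Adobe RGB triple to sRGB using the standard published RGB↔XYZ matrices for the shared D65 white point; a browser that ignores the profile skips this conversion entirely and just reuses the raw numbers:

```python
# Sketch: convert one Adobe RGB color to sRGB.  The matrices are the
# standard published transforms; adobe_to_srgb is a name made up here.

ADOBE_TO_XYZ = [
    (0.5767309, 0.1855540, 0.1881852),
    (0.2973769, 0.6273491, 0.0752741),
    (0.0270343, 0.0706872, 0.9911085),
]
XYZ_TO_SRGB = [
    ( 3.2404542, -1.5371385, -0.4985314),
    (-0.9692660,  1.8760108,  0.0415560),
    ( 0.0556434, -0.2040259,  1.0572252),
]

def adobe_to_srgb(rgb):
    # Decode the Adobe RGB gamma (nominally 563/256) to linear light.
    lin = [c ** (563.0 / 256.0) for c in rgb]
    # Linear Adobe RGB -> XYZ -> linear sRGB.
    xyz = [sum(m * c for m, c in zip(row, lin)) for row in ADOBE_TO_XYZ]
    lin_s = [sum(m * c for m, c in zip(row, xyz)) for row in XYZ_TO_SRGB]
    out = []
    for c in lin_s:
        # Clip out-of-gamut values, then apply the sRGB transfer curve.
        c = min(max(c, 0.0), 1.0)
        out.append(12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055)
    return out

# A fully saturated Adobe RGB red lies outside the sRGB gamut, so it clips:
print(adobe_to_srgb([1.0, 0.0, 0.0]))
```

Going the other way is the visible failure: treating Adobe RGB numbers as if they were already sRGB skips the decode/re-encode above, which is exactly the muting in the left-hand picture.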

There are several solutions to this problem:

1. If you are shooting with a point and shoot camera, you probably won’t have this problem.  Most cameras use the sRGB color space by default and photo editing software will respect this.
2. If you are shooting with a DSLR, you could always shoot in JPEG mode.  This will give you the same benefits as a point and shoot.
3. If you’re like me and want the extra dynamic range offered by RAW, you could try editing using programs other than iPhoto.  I know Aperture can be set to export to sRGB regardless of what color space you edit in.
4. If you’re even more like me and you don’t want to buy software, you could use ColorSync, which comes with every Mac, to change the color space of your picture.  It’s very easy to set up an Automator action to convert any JPEG image from Adobe RGB to sRGB.  It’s a quite fast operation.

The real problem is for people viewing my pictures.  I’m not particularly eager to go back and update all of my pictures to fix the color profile.  So which browsers can actually view my pictures correctly?

| Browser | Color management |
| --- | --- |
| Firefox 3 (Win/Mac/Linux) | Works! (Supports ICC v2 profiles) |
| Firefox 4 (Win/Mac/Linux) | Works! (Supports ICC v2 profiles) |
| Chrome 10 (Win/Mac/Linux) | Does not work! |
| Safari 5 (Win/Mac) | Works! (Supports ICC v2 and v4 profiles) |
| Internet Explorer 7 (Win) | Does not work! |
| Internet Explorer 9 (Win) | Works (sort of) (Supports ICC v2; v4 seems broken) |

I recommend that if you’re using Chrome to view my pictures, you try looking at them again in a supported browser.  Use this website to test your browser for compatibility.

Written by arlencox

March 20, 2011 at 8:43 pm

Posted in Uncategorized

## Building Your Own Synthesizer Part 1 – Getting Started

This is the start of a series of posts developing a MIDI (Musical Instrument Digital Interface) synthesizer.  For the sake of simplicity, we will start very simple and go from there.  The first synthesizer we will build will be a breath-sensitive sine wave.  We will then develop that into a phase modulation synthesizer.

## Some Tools

Rather than developing a plugin that works with just VST or AU hosts, we are going to develop a standalone synthesizer.  Once you have a standalone synth that takes MIDI as input and produces samples, converting it to work as VST or AU should just be a matter of creating the wrappers.  We will discuss this conversion in a later entry in the series.

Since the goal is to make something useful across multiple platforms, we need a way to do real-time audio and MIDI in a cross-platform way.  For this we turn to RtAudio and RtMidi.  These are two (compatible) libraries developed by Gary Scavone, who is one of the original waveguide-based physical modelling researchers.

## Diving Into The Code

We will go over the code starting from the set up of MIDI and audio support. We will then go over the sound generation. Finally we will develop the MIDI control.

### Setting up and probing MIDI

#include <cstdlib>
#include <iostream>
#include "RtMidi.h"
#include "RtAudio.h"

using namespace std;

...

int main(int argc, char** argv)
{
  RtMidiIn *midiin;
  int nPorts;
  int port;
  // other declarations
  ...
  try {
    midiin = new RtMidiIn();
  } catch ( RtError &error ) {
    error.printMessage();
    exit( EXIT_FAILURE );
  }
  nPorts = midiin->getPortCount();

  if(argc != 2) {
    // list inputs
    string portName;
    for(int i = 0; i < nPorts; i++) {
      try {
        portName = midiin->getPortName(i);
      } catch ( RtError &error ) {
        error.printMessage();
        goto cleanup;
      }
      cout << "  Input port #" << i+1 << ": " << portName << endl;
    }
    goto cleanup;
  }

  // open specified port
  port = atoi(argv[1]) - 1;
  if(port >= nPorts) {
    cerr << "Invalid port number: " << port << endl;
    goto cleanup;
  }

  midiin->openPort(port);
  midiin->setCallback(&midicallback);
  midiin->ignoreTypes(false, false, false);

  // audio set up
  ...

cleanup:
  delete midiin;
  return 0;
}


The code starts by declaring a pointer for the midi input. We then initialize it and get the number of connected ports. If a port was not specified on the command line, we print out all of the ports and their corresponding names and then exit. getPortCount() gets the number of MIDI ports currently available on the computer. We can then use getPortName() to get the string representation of each port.

To open the selected port, we call openPort with the appropriate port number and then specify the callback routine that is responsible for handling MIDI messages. We also specify that no MIDI messages should be ignored.

In the case of an error, we want to free the memory allocated for midiin, so we jump to the cleanup section and delete the allocated object.  This is poor style and we will enhance it in a future version.  This would be better handled with auto_ptr or by simply using static allocation.

### Setting up Audio

unsigned sampleRate;
unsigned bufferFrames;
RtAudio dac;
RtAudio::StreamParameters parameters;


We then fill in the audio setup region:

parameters.deviceId = dac.getDefaultOutputDevice();
parameters.nChannels = 2;
parameters.firstChannel = 0;

sampleRate = 44100;
bufferFrames = 128;

try {
  dac.openStream(&parameters, NULL, RTAUDIO_FLOAT64, sampleRate, &bufferFrames, &sine_gen, NULL);
  dac.startStream();
} catch(RtError& e) {
  e.printMessage();
  goto cleanup;
}


We get the default output device and load it into the parameters structure.  We select stereo output starting with the first channel (index zero) on the device.  We also select a sample rate of 44100 Hz, which is compatible with most output devices, and a buffer of 128 samples, which will produce an output delay of 128/44100 ≈ 3 ms plus MIDI delay plus hardware buffering delay.  This should keep the delay low enough that it is not too noticeable.
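As a quick sanity check on that latency figure (a throwaway calculation, not part of the synth):

```python
# Buffer latency in milliseconds for a 128-frame buffer at 44100 Hz.
sample_rate = 44100
buffer_frames = 128

latency_ms = buffer_frames / sample_rate * 1000
print(f"{latency_ms:.2f} ms")  # roughly 2.90 ms before MIDI and hardware delays
```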

We then open the output stream and pass a callback function in that will be responsible for filling the buffer. If there is an error, we catch it, clean up and exit.

### MIDI processing

The MIDI processing routine updates global variables that dictate the sound that will be generated. Therefore, we introduce some global variables:

double phase;
double deltaphase;
double delta;
double amplitude;


The phase variable stores the current phase of the waveform that is being played.  The deltaphase variable dictates how much change in phase there will be per sample.  The delta variable dictates how much time passes per sample, and amplitude is the volume parameter for the waveform.

Using phases instead of time keeps the sound pop-free.  If you output sin(2*pi*f*t), you’ll get the correct frequency, but when you change the f value, the output sample may have been high and now it will be low, causing a very short, high-frequency transient, which is heard as a pop.  Using the phase avoids this: the frequency changes the rate at which we update the phase, rather than the actual f value.
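The difference is easy to demonstrate.  This standalone Python sketch (with made-up frequencies, separate from the synth code) switches frequency mid-stream: the phase-accumulator version stays continuous, while the sin(2*pi*f*t) version jumps.

```python
import math

SAMPLE_RATE = 44100.0

def gen(freqs, use_phase):
    """Generate one sample per entry of freqs, switching frequency as we go."""
    out = []
    phase = 0.0
    for n, f in enumerate(freqs):
        if use_phase:
            out.append(math.sin(phase))
            phase += 2.0 * math.pi * f / SAMPLE_RATE  # advance by deltaphase
        else:
            out.append(math.sin(2.0 * math.pi * f * n / SAMPLE_RATE))
    return out

# 100 samples at 440 Hz, then an abrupt switch to 1000 Hz.
freqs = [440.0] * 100 + [1000.0] * 100
for use_phase in (True, False):
    samples = gen(freqs, use_phase)
    jump = max(abs(b - a) for a, b in zip(samples, samples[1:]))
    print("phase accumulator" if use_phase else "sin(2*pi*f*t)", round(jump, 3))
```

The largest sample-to-sample jump of the phase accumulator is bounded by deltaphase itself, while the time-based version jumps by an arbitrary amount at the switch, which is the pop.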

unsigned breath;
unsigned note;

void midicallback(double deltatime, vector<unsigned char> *message, void *userData) {
  //unsigned nBytes = message->size();
  /*for(unsigned i = 0; i < nBytes; i++)
    std::cout << "Byte " << i << " = " << (int)message->at(i) << ", ";
  if(nBytes > 0)
    std::cout << "stamp = " << deltatime << std::endl;*/

  switch((*message)[0] & 0xF0) {
  case 0x80: // note off
    //amplitude = 0.0;
    cout << "Note off: " << (unsigned)((*message)[1]) << endl;
    break;
  case 0x90: // note on
    {
      if((*message)[2] == 0)
        break;
      (*message)[1] -= 12;
      float freq = midiToFreq[(*message)[1]];
      deltaphase = 2.0*M_PI*(double)freq*delta;
      breath = (*message)[2];
      amplitude = (double)(*message)[2]/127.0;
      note = (*message)[1];
      cout << "Note on: " << (unsigned)((*message)[1]) << " " << (double)freq << " " << (unsigned)((*message)[2]) << endl;
    }
    break;
  case 0xB0: // controller change
    switch((*message)[1]) {
    case 2: // breath controller
      if(breath == 0)
        break;
      breath = (*message)[2];
      cout << "Breath: " << (unsigned)((*message)[2]) << endl;
      amplitude = (double)(*message)[2]/127.0;
      break;
    default:
      break;
    }
    break;
  case 0xE0: // pitch bend
    {
      int bend_amount = ((*message)[2] << 7) + (*message)[1];
      bend_amount -= 8192;
      cout << "Pitch Bend: " << bend_amount << endl;
      float bend = (float)bend_amount/8192.0/8.0+1.0;
      float freq = midiToFreq[note]*bend;
      deltaphase = 2.0*M_PI*(double)freq*delta;
    }
    break;
  default: // throw it away
    break;
  }
}


Now we generate the appropriate parameters from the MIDI messages.  We ignore the channel in our prototype, as it does not actually make a difference and could just complicate using the synth.  We do this by bitwise ANDing the first byte of the message with 0xF0.  The lower four bits of the first byte are the channel, so this masks the channel off and treats every channel as channel zero.

Based on the value of the first byte, we know what kind of message we have.  A 0x80 message indicates note off.  A 0x90 message indicates note on, a 0xB0 message indicates controller change (which we use for breath pressure) and 0xE0 indicates pitch bend.

One of the catches with this is that a note on message can also indicate note off: if the velocity is zero, it means note off.  Additionally, legato is created by sending the next note on before the previous note off.  If you’re not careful to handle this, you may get some strange behavior.  We handle it by checking to see if the velocity field (byte 2) is 0 before handling note on.  If it is zero, we simply break out and don’t do anything.  As the note starts and stops are dictated by the breath pressure rather than the note on and off messages, we don’t have to properly handle note off.
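To make the message layout concrete, here is a small standalone Python sketch (the function name is made up for illustration) that classifies a raw status byte and velocity the same way the C++ callback does:

```python
def classify(message):
    """Classify a raw MIDI message the way the synth's callback does.

    message is a list of bytes: [status, data1, data2].
    """
    status = message[0] & 0xF0  # mask off the 4-bit channel
    if status == 0x80:
        return "note off"
    if status == 0x90:
        # A note-on with velocity 0 is really a note off (used for legato).
        return "note off" if message[2] == 0 else "note on"
    if status == 0xB0:
        return "controller change"
    if status == 0xE0:
        return "pitch bend"
    return "ignored"

# The channel nibble is masked away: 0x93 is note on, channel 3.
print(classify([0x93, 60, 100]))  # note on
print(classify([0x90, 60, 0]))    # note off (velocity 0)
```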

We convert the MIDI note value (byte 1 of a note on message) into a frequency using a lookup table that can be generated by the following Python program:

a = 440.0

s = a / (2.0 ** (69.0 / 12))

def fofm(m):
    return s * 2 ** (m / 12.0)

for x in range(128):
    print("%f," % fofm(x))


It generates a mapping of MIDI notes to their corresponding frequencies.  We use the frequency to compute the deltaphase value, and we also set an initial breath pressure value.  The latter works around a bug in the controller: it generates breath pressure messages before note on messages.  Therefore, we ignore all breath messages after a pressure of 0 is reached until a note on is received.  We must set the pressure at this point or we won’t get any sound until the next breath pressure message is sent, which may never happen.

The rest of this handling is in the controller change code. If the breath pressure value was already 0, we ignore all breath controller changes until the breath pressure is already non-zero. It is set to a non-zero value by the note on code.

Pitch bend simply multiplies the frequency by the value received from the pitch bend message.  The value is received as [pitch bend, lsb, msb].  We join the msb and lsb into a single 14-bit number and subtract 8192 so that 0 means there is no pitch bend.  We scale this value and center it around 1 so that when we multiply it by the frequency we get an appropriately modified frequency value.
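The decoding step is easy to check in isolation.  A standalone sketch mirroring the arithmetic in the C++ above (the function names are made up here):

```python
def decode_bend(lsb, msb):
    """Join the 7-bit msb/lsb into a signed 14-bit bend amount."""
    return ((msb << 7) + lsb) - 8192

def bend_multiplier(lsb, msb):
    """Scale the bend into a frequency multiplier centered on 1.0."""
    return decode_bend(lsb, msb) / 8192.0 / 8.0 + 1.0

# The controller sends lsb=0, msb=64 when the bend wheel is centered.
print(decode_bend(0, 64))      # 0
print(bend_multiplier(0, 64))  # 1.0
print(decode_bend(127, 127))   # 8191, the maximum upward bend
```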

### Sound Generation

We decided to support two different waveforms. A sine wave produces a mellow tone, while a sawtooth produces a hard buzz sound common to synthesizers. These sounds are generated by the following code:

double saw(double phase) {
  return 1.0/M_PI*phase - 1.0;
}

int sine_gen(void* outputBuffer, void* inputBuffer, unsigned nBufferFrames, double streamTime, RtAudioStreamStatus status, void* userData)
{
  if(status)
    cout << "Stream underflow detected!" << endl;

  double* buffer = (double *)outputBuffer;

  for(unsigned i = 0; i < nBufferFrames; ++i) {
    //double val = amplitude*sin(phase);
    double val = amplitude*saw(phase);
    *buffer++ = val;
    *buffer++ = val;
    phase += deltaphase;
    if(phase > 2.0*M_PI)
      phase -= 2.0*M_PI;
  }

  return 0;
}


We iterate through all of the requested frames and output samples accordingly.  Since the output buffer is an array of double-precision floating point numbers and the two channels are interleaved, we compute the value (val) and then place it into both channels.  Also, to maintain precision, we subtract 2*Pi from the phase whenever it gets larger than 2*Pi.

### Bugs and Deficiencies

Currently we are not preserving the pitch bend from note to note. This means that if you maintain a bend and start a new note, it won’t have the bend applied. Also, the code is incredibly unclean and hacky, though it does work well with my EWI-USB. In the next version, we will create a nice interface for handling MIDI and audio output from our synthesizer and we will make a better version of the same synthesizer.

Written by arlencox

November 21, 2010 at 9:42 pm

Posted in Uncategorized

## Proving Modus Tollens in Coq

Modus Tollens is quite similar to Modus Ponens except that it proves the contrapositive of the implication:

$\forall p, q. \; \neg q \wedge (p \rightarrow q) \rightarrow \neg p$

If $p \rightarrow q$, we know two things: modus ponens says that if $p$ is true, then $q$ must be true. Modus tollens says that if $q$ is false, then $p$ must be false.

We can prove modus tollens by assuming that $q$ is false and that $p \rightarrow q$ is true. Since $(p \rightarrow q) \leftrightarrow (\neg p \vee q)$, if $q$ is false, $p$ must be false, completing the proof.
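The whole argument can be compressed into one line: contraposition rewrites the implication, and modus ponens with $\neg q$ finishes it:

$\neg q \wedge (p \rightarrow q) \;\equiv\; \neg q \wedge (\neg q \rightarrow \neg p) \;\vdash\; \neg p$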

Similar to what we did for Modus Ponens, we prove Modus Tollens in Coq. First we present the theorem to Coq. Coq shows the entire theorem as the goal.

Coq < Section Modus_tollens.

Coq < Theorem MT: forall P Q: Prop, ~Q /\ (P -> Q) -> ~P.
1 subgoal

============================
forall P Q : Prop, ~ Q /\ (P -> Q) -> ~ P


Next, we introduce three hypotheses from the forall and from the antecedent of the implication:

MT < intro P.
1 subgoal
P : Prop
============================
forall Q : Prop, ~ Q /\ (P -> Q) -> ~ P

MT < intro Q.
1 subgoal
P : Prop
Q : Prop
============================
~ Q /\ (P -> Q) -> ~ P

MT < intro HNQPQ.
1 subgoal
P : Prop
Q : Prop
HNQPQ : ~ Q /\ (P -> Q)
============================
~ P


Exactly like we did for Modus Ponens, we split the conjunction in the hypothesis into two hypotheses:

MT < destruct HNQPQ as [HNQ HPQ].
1 subgoal
P : Prop
Q : Prop
HNQ : ~ Q
HPQ : P -> Q
============================
~ P


This is where we have to do something different.  The goal ~P unfolds to P -> False, so introducing the antecedent effectively moves P from the left-hand side of that implication into the hypotheses.  Now we have P in our hypotheses, but we’re trying to prove False as our goal, so we need to find a contradiction among the hypotheses to do so.

MT < intro HP.
1 subgoal
P : Prop
Q : Prop
HNQ : ~ Q
HPQ : P -> Q
HP : P
============================
False


We can find the required contradiction by using P and P -> Q along with modus ponens to show that Q must also be a hypothesis.  It is introduced as an antecedent of the previous goal.

MT < generalize (HPQ HP).
1 subgoal
P : Prop
Q : Prop
HNQ : ~ Q
HPQ : P -> Q
HP : P
============================
Q -> False


We can then introduce Q as a new hypothesis because it is on the left-hand side of the implication.  This produces a contradiction because we have both Q and ~Q in the hypotheses.

MT < intro HQ.
1 subgoal
P : Prop
Q : Prop
HNQ : ~ Q
HPQ : P -> Q
HP : P
HQ : Q
============================
False


We declare the contradiction and thus complete the proof.

MT < contradiction.
Proof completed.


Written by arlencox

November 3, 2010 at 6:49 pm

Posted in Uncategorized

## Proving Modus Ponens in Coq

Recently, I have been learning how to use theorem provers.  Coq has been my tool of choice because of its clean syntax and relatively good (but still terrible) documentation.  It seems to be quite a powerful tool, but it is quite difficult to use.

When I am proving theorems, I apply a series of tactics.  Consider Modus Ponens:

$\forall p, q. \; p \wedge (p \rightarrow q) \rightarrow q$

If $p$ is false or if $p \rightarrow q$ is false, the formula is trivially true (the hypothesis/antecedent is false).  So we only need to consider the case where $p$ is true and $p \rightarrow q$ is true.  Since $p$ is true, in order for $p \rightarrow q$ to be true, $q$ must be true.  Therefore $q$ is true.

So how do we do this proof in Coq?

First, we start a section, which groups a number of theorems.  We decided to call this one Modus_ponens.

Coq < Section Modus_ponens.


Next, we declare the theorem.  We declare P and Q as propositional variables.  Note that this creates a goal for us to prove.  Our hypotheses are stated above the ====== line and our goal is stated below the line.

Coq < Theorem MP: forall P Q: Prop, P /\ (P -> Q) -> Q.
1 subgoal

============================
forall P Q : Prop, P /\ (P -> Q) -> Q


We introduce our first hypothesis.  P is quantified as a propositional variable, so we can safely introduce it as a hypothesis; the intro tactic moves it from the goal into the hypotheses.

MP < intro P.
1 subgoal

P : Prop
============================
forall Q : Prop, P /\ (P -> Q) -> Q


We introduce Q to the hypotheses exactly as we did with P.

MP < intro Q.
1 subgoal

P : Prop
Q : Prop
============================
P /\ (P -> Q) -> Q


If the antecedent of the implication is false, the goal is trivially true, so we only need to consider the case where the antecedent is true.  We can therefore add it to our hypotheses.

MP < intro HPPQ.
1 subgoal

P : Prop
Q : Prop
HPPQ : P /\ (P -> Q)
============================
Q


Since we now have P /\ (P -> Q) in our hypotheses, we should be able to complete the proof, but we can’t do anything with this hypothesis as-is.  It is a compound hypothesis.  Therefore, we must break it down into two separate hypotheses.  We will call these hypotheses HP and HPQ, corresponding to the hypothesis of P and the hypothesis of P -> Q.  The destruct command will split a conjunction in the hypothesis into two hypotheses.

MP < destruct HPPQ as [HP HPQ].
1 subgoal

P : Prop
Q : Prop
HP : P
HPQ : P -> Q
============================
Q


Now that we have all the hypotheses that we need, we can complete the proof. First, since we have that P -> Q, if we have P, we have Q. So we apply P -> Q to our current goal. This gives us a new goal of P.

MP < apply HPQ.
1 subgoal

P : Prop
Q : Prop
HP : P
HPQ : P -> Q
============================
P


Since our goal is P and we have a P in our hypothesis, we can finish the proof by saying that the goal is exactly HP.

MP < exact HP.
Proof completed.


Written by arlencox

October 30, 2010 at 1:27 am

Posted in Uncategorized