Saturday, January 05, 2013

Calculating NMEA sentence checksums with Python


Another feature I wanted to add to the NTP/GPS project I'm building with my Raspberry Pi is validation of the NMEA sentences I'm reading in: calculating the XOR checksum of each sentence and comparing it with the checksum provided by the GPS receiver.

This is especially important if you plan on writing an application that reads $GPGGA, $GPGLL or $GPRMC sentences for timekeeping purposes and you want to verify that the sentence you read the time from is legit.


Background on NMEA Checksums


After reading some documentation on NMEA sentence types and, more importantly, on checksums, it turns out the process is quite simple and can be coded up fairly easily.

Here's the main blurb from the NMEA documentation about the checksum and how to calculate it:
"Programs that read the data should only use the commas to determine the field boundaries and not depend on column positions. There is a provision for a checksum at the end of each sentence which may or may not be checked by the unit that reads the data. The checksum field consists of a '*' and two hex digits representing an 8 bit exclusive OR of all characters between, but not including, the '$' and '*'. A checksum is required on some sentences."
Not so hard, right? We need to exclusive OR (XOR) all of the characters (INCLUDING the commas) between the '$' and the '*'. 

Let's do it!


NMEA Sentence Breakdown


Here's an example NMEA sentence that I will use to show the XOR process:

$GPGGA,174134.000,4345.9112,N,09643.8029,W,1,05,2.7,452.6,M,-27.1,M,,0000*60

The color breakdown is:

            Black: The two positional characters ('$' and '*') in the NMEA sentence that we need to read
                      our characters from between
            
            Green:  Characters we need to XOR (everything from 'GPGGA' through the final '0000')
            Blue:  The checksum provided by the receiver ('60') that we need to compare our calculated checksum against
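
The three parts described above can be pulled apart with plain string methods. This is only a minimal sketch: the helper name split_nmea is made up for illustration, and it assumes the sentence has the usual '$...*hh' shape.

```python
# Hypothetical helper, not part of the original project code.
def split_nmea(sentence):
    """Return (payload, checksum_text) for a '$...*hh' sentence."""
    # Drop surrounding whitespace such as a trailing CR/LF
    sentence = sentence.strip()
    # Keep everything after the first '$' ...
    _, _, rest = sentence.partition("$")
    # ... then split at '*' into the characters to XOR and the hex digits
    payload, _, checksum_text = rest.partition("*")
    return payload, checksum_text


example = "$GPGGA,174134.000,4345.9112,N,09643.8029,W,1,05,2.7,452.6,M,-27.1,M,,0000*60"
payload, checksum_text = split_nmea(example)
print(payload)        # GPGGA,174134.000,4345.9112,N,09643.8029,W,1,05,2.7,452.6,M,-27.1,M,,0000
print(checksum_text)  # 60
```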


Bitwise XOR on NMEA Sentences


As the documentation states, we need to exclusive OR (XOR) all of the characters (INCLUDING the commas) between the '$' and the '*'.

The process is quite simple: we take each character and XOR it with the running result of XOR'ing all of the previous characters. The result after the very last character has been XOR'd is the final checksum value, which you'd then compare with the checksum provided in the sentence.

If you're a bit fuzzy on XOR, the bitwise operator in most programming languages is ' ^ ' and the rules are as follows:


  • 1 ^ 1 = 0
  • 1 ^ 0 = 1
  • 0 ^ 1 = 1
  • 0 ^ 0 = 0

One thing to note: when you start XOR'ing, you'll want to XOR your first character with zero (e.g. 0, 0x00, 0b0, etc.) so that the next iteration (which would be the 2nd character in the NMEA string) XORs against the binary value of the first character.  Why is this so?  If you look at the XOR rules above, XOR'ing a bit with '0' never flips it: a '1' stays '1' and a '0' stays '0'.  So if we XOR against 0, we get our original value back.
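
Here is a quick way to convince yourself of that in a Python shell. It is just a small sketch using the first two characters of the example sentence; the variable names are arbitrary.

```python
g = ord("G")  # 71 decimal, 0b1000111

# XOR with zero leaves every bit unchanged
assert 0 ^ g == g

# XOR-ing a value with itself cancels out to zero
assert g ^ g == 0

# So a running XOR that starts at 0 is simply the XOR of the characters
csum = 0
for c in "GP":
    csum ^= ord(c)

print(bin(csum))  # 0b10111
```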

Let's use our example sentence above to go through a handful of binary XOR iterations of what we'll be accomplishing in code:

0b0000000     initial value (zero)
0b1000111     G
----------
0b1000111     XOR output
0b1010000     P
----------
0b0010111     XOR output
0b1000111     G
----------
0b1010000     XOR output
0b1000111     G
----------
0b0010111     XOR output
0b1000001     A
----------
0b1010110     XOR output
0b0101100     ,
----------
0b1111010     XOR output
0b0110001     1
----------
0b1001011     XOR output

...

0b1010000     XOR output
0b0110000     0
----------
0b1100000     XOR output
0b0110000     0
----------
0b1010000     XOR output
0b0110000     0
----------
0b1100000     Our checksum (96 decimal, 0x60 HEX)

As you can see, we ended up with '0x60', which matches the '*60' at the end of our example sentence above. So we were able to validate this NMEA sentence!
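
If you'd like to reproduce the table above rather than work it out by hand, something like the following should do it. Treat it as a sketch: xor_trace is a made-up helper name, and the 7-digit binary formatting is only chosen to match the columns above.

```python
# Hypothetical helper for printing the intermediate XOR values.
def xor_trace(payload):
    """Return a list of (character, running XOR value) pairs."""
    trace = []
    csum = 0
    for c in payload:
        csum ^= ord(c)
        trace.append((c, csum))
    return trace


# First few characters of the example sentence
for char, value in xor_trace("GPGGA,1"):
    print(f"{char}  0b{value:07b}")

# Running it over the whole payload ends on the checksum
full = "GPGGA,174134.000,4345.9112,N,09643.8029,W,1,05,2.7,452.6,M,-27.1,M,,0000"
print(hex(xor_trace(full)[-1][1]))  # 0x60
```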

NMEA Checksum Python Code


Now that the explanation is out of the way, let's look at the code.  It's really simple:
def chksum_nmea(sentence):

    # Nuke any lingering newline or CRLF first, otherwise the
    # "last two characters" below would not be the checksum
    sentence = sentence.strip()

    # This is a string, will need to convert it to an integer
    # for proper comparison below
    cksum = sentence[len(sentence) - 2:]

    # String slicing: Grabs all the characters
    # between '$' and '*'
    chksumdata = sentence[sentence.find("$")+1:sentence.find("*")]

    # Initializing our first XOR value
    csum = 0

    # For each char in chksumdata, XOR against the previous
    # XOR'd char.  The final XOR of the last char will be our
    # checksum to verify against the checksum we sliced off
    # the NMEA sentence

    for c in chksumdata:
        # XOR'ing value of csum against the next char in line
        # and storing the new XOR value in csum
        csum ^= ord(c)

    # Do we have a validated sentence?
    return csum == int(cksum, 16)

There you have it!
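
If you're reading from a noisy serial line, you may also want the check to fail quietly on malformed input instead of raising an exception. Below is one possible variant, offered as a sketch rather than the project's actual code: the names nmea_checksum and is_valid_nmea are made up here, and the rule that a sentence must start with '$' and carry exactly two hex digits after '*' is an assumption of this example.

```python
from functools import reduce
from operator import xor


# Hypothetical helper names, chosen for this example only.
def nmea_checksum(payload):
    """XOR together the character codes of payload, starting from 0."""
    return reduce(xor, (ord(c) for c in payload), 0)


def is_valid_nmea(sentence):
    """True only if sentence looks like '$payload*hh' and hh matches."""
    sentence = sentence.strip()  # drop a trailing CR/LF, if any
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    payload, _, received = sentence[1:].partition("*")
    if len(received) != 2:
        return False
    try:
        return nmea_checksum(payload) == int(received, 16)
    except ValueError:
        # The checksum field was not hexadecimal
        return False


good = "$GPGGA,174134.000,4345.9112,N,09643.8029,W,1,05,2.7,452.6,M,-27.1,M,,0000*60"
print(is_valid_nmea(good + "\r\n"))                 # True
print(is_valid_nmea(good.replace("*60", "*61")))    # False
print(is_valid_nmea("not an NMEA sentence"))        # False
```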

6 comments:

Anonymous said...

Best description that I've found on the web of exactly how to handle the NMEA data and XOR each character in the NMEA sentences. THANK YOU!!

Anonymous said...

Awesome, found several bad examples of this online.

Anonymous said...

The behavior of removing the new line had mixed results. Here's an alternative version to try.


def nmea_chksum(sentence):
    sentence = sentence.rstrip('\n')
    cksum = sentence[len(sentence) - 3:]
    chksumdata = re.sub("(\n|\r\n)","", sentence[sentence.find("$")+1:sentence.find("*")])
    csum = 0
    for c in chksumdata:
        csum ^= ord(c)
    if hex(csum) == hex(int(cksum, 16)):
        return True
    return False


nmea_chksum(sentence)

SteveMann said...

Finally, something I can add to...

This code works except that cksum contains the checksum flag * and int(cksum, 16) fails.

Change
cksum = sentence[len(sentence) - 3:]

to
cksum = sentence[len(sentence) - 2:]

because the NMEA checksum is always the last two characters on the line.

Алексей said...

Thank you! It was very helpful for me!

Baltimore Bill said...

Thanks for this - very useful also on micropython with the Raspberry Pi Pico microcontroller.
I now have a working routine, albeit with some mods to the string handling to remove "\r\n" which can be troublesome.