Python Code to Grab Two Letter State Codes
By Laurent Weichberger at Collibra

Python Code to Grab Two Letter State Codes

I needed to have just a list of two letter state codes for a Collibra Data Quality custom rule I was creating for a customer. I found a states.csv file online that had two columns, State Name and Code (Abbreviation). I didn't need the header, so I deleted the first line. I now had this:

states.csv file content

Article content

Then I write some Python code to read the file and write out just the list of codes, like this:

# Input: ~/Downloads/states.csv

import csv
path ='/Users/<Your_User_Name>'
with open(path + '/Downloads/states.csv', 'r') as file:
    csv_reader = csv.reader(file)
    with open(path + '/Downloads/state_codes.txt', 'a') as file:
        for row in csv_reader:
            code = '\'' + row[1] + '\','
            file.write(code)           

When I went to look for my results, I found this file as expected state_codes.txt:

'AL','AK','AZ','AR','CA','CO','CT','DE','DC','FL','GA','HI','ID','IL','IN','IA','KS','KY','LA','ME','MT','NE','NV','NH','NJ','NM','NY','NC','ND','OH','OK','OR','MD','MA','MI','MN','MS','MO','PA','RI','SC','SD','TN','TX','UT','VT','VA','WA','WV','WI','WY',        

I copied that into my DQ custom rule, omitting the final comma :)

Voila!

For more information:

Laurent Weichberger (Big Data Bear)

Principal Customer Engineer

laurent (dot) weichberger (at) collibra (dot) com


Article content
Big Data Bear


Another option. Since it looks like you downloaded a file. Simply add the file to CDQ and have the rule be “where state_cd IN (states from state_codes.txt)

To view or add a comment, sign in

More articles by Laurent Weichberger

Explore content categories