Editor’s Note: We’re finishing up our first semester of the Paleantology project! One big problem in managing big data is keeping everything organized and accessible. I wrote a little about that here. Another issue is making sure data are entered without error. To that end, Krishna Gandikota joined the lab for the fall to learn more about programming. He worked with me on a piece of software that automatically downloads and parses taxonomy information from AntWiki for the ants in our dataset. He’s leaving us for the University of Iowa biomedical engineering program. Good luck in Iowa City, Krishna!
What does the software you worked on in the lab do?
The software that we created downloads taxonomy data for the ants in the fossil and molecular datasets. It reads a text file containing the names of taxa and converts them into URLs for the AntWiki database, which are stored in a dictionary. It then makes an HTTP request to each URL and uses Beautiful Soup and regular expressions to extract specific information from the webpage. The last step of the program writes the new ant data to our ant taxonomy database.
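For readers who want a feel for that pipeline, here is a minimal sketch, not the lab's actual code. The URL pattern in `BASE_URL`, the function names, and the "first word ending in -inae" rule for spotting a subfamily are all assumptions made for illustration; the real program's fields and regular expressions may differ. It requires the third-party `beautifulsoup4` package for the parsing step, and it leaves out the final step of writing to the taxonomy database.

```python
import re
import urllib.request
from urllib.parse import quote

# Assumed URL pattern for AntWiki pages (not confirmed by the original post).
BASE_URL = "https://www.antwiki.org/wiki/"


def build_urls(names):
    """Map each taxon name to an (assumed) AntWiki page URL."""
    urls = {}
    for name in names:
        name = name.strip()
        if name:  # skip blank lines
            urls[name] = BASE_URL + quote(name.replace(" ", "_"))
    return urls


def read_taxa(path):
    """Read one taxon name per line from a text file and build the URL dictionary."""
    with open(path, encoding="utf-8") as handle:
        return build_urls(handle)


def extract_subfamily(html):
    """Hypothetical extraction rule: return the first capitalized word ending in 'inae'."""
    # Imported here so the URL helpers above work without the third-party package.
    from bs4 import BeautifulSoup

    text = BeautifulSoup(html, "html.parser").get_text(" ")
    match = re.search(r"\b([A-Z][a-z]+inae)\b", text)
    return match.group(1) if match else None


def fetch_subfamily(url):
    """Download one page over HTTP and run the extraction rule on it."""
    with urllib.request.urlopen(url, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_subfamily(html)
```

Calling `build_urls(["Atta cephalotes"])` returns a dictionary that maps the name to `https://www.antwiki.org/wiki/Atta_cephalotes`, and each URL can then be passed to `fetch_subfamily`. Keeping the URL building separate from the download step makes it possible to check the name handling without touching the network.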
What have you learned?
Initially I thought Python was only used for writing ordinary programs. Through this project, I learned that it is also well suited to scripting and making HTTP requests. The language is easy to work with and much simpler than the other languages I have been exposed to. Overall, my experience with both the project and the language was beneficial.
What do you still need to learn?
I need to get better at working out the structure of a function before I write it, and at performance tuning. I also need to learn to write code without making syntax errors. Memorizing the basic functions and syntax will make coding more efficient, and that will come with more practice.