Exploration into Python

The last couple of weeks have centered on learning how programming can benefit historians conducting research. One language in particular, Python, shows promise for historical work. Python is considered a high-level programming language. This means that its code is written in a form close to human language, while the details of how the computer carries out each instruction are handled for the programmer. Python stands out as being very easy for beginners to pick up because it is designed to be human readable. Being easy to read and understand increases the likelihood that people will use it as a tool. That approachability is what makes Python such a strong language: almost anyone can pick it up and begin programming.

My personal experience with Python programming goes back further than these last two weeks. I first worked with it at a summer camp in the summer of 2018, where I was tasked with creating Python scripts. Coming from a predominantly C++ and Java background, it was a bit of a challenge to write code in a less robotic language such as Python. Where Python really shines compared to those languages is that it does not require data types to be declared. In many programming languages a variable must be given a data type, which specifies the kind of information it can store. That information can range from words and sentences, called strings, to integers and fractional numbers. Python values still have types, but a variable is not tied to one, so any kind of data can be stored in any variable. This flexibility leads to unique solutions to problems. The majority of the Programming Historian lessons focus on this approach to programming, and many of them explore the concept of data manipulation. Humans do this all the time: when a person reads a book, they process the information in it and then manipulate it to form opinions. The Programming Historian lessons show that a computer can speed up this process of reading texts. One example is the stylometry lesson, in which the 85 Federalist Papers are processed by the computer and compared for specific writing styles. Having worked through these lessons, I feel that the Programming Historian does an excellent job of explaining how to write code that works efficiently to complete larger analysis jobs that would otherwise take historians a long time.
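To make the point about data types concrete, here is a small sketch of my own (it is not taken from any of the lessons). The variable names are just placeholders I picked for illustration. It shows one name being reused for different kinds of data, and a single list holding a mix of them.

```python
# A variable is only a name; the value it points to carries the type.
record = "Federalist No. 10"           # a string
first_kind = type(record).__name__     # 'str'

record = 10                            # the same name now holds an integer
second_kind = type(record).__name__    # 'int'

# One list can hold different kinds of data side by side.
paper = ["Federalist No. 10", 1787, ["Madison"]]
kinds = [type(item).__name__ for item in paper]

print(first_kind, second_kind)  # str int
print(kinds)                    # ['str', 'int', 'list']
```

In C++ or Java the second assignment to record would not compile, because the variable would have been declared as a string.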

 

import os


class authorConstruction:
    """Maps each author to a list of the papers (file names) that mention them."""

    def __init__(self, Authors, dataDir):
        # a data structure which will contain the author, and a list of all the papers for this author.
        self.DicOfAuthor = {}
        # a list which will be gained from the view.
        self.listOfAuthors = Authors

        # collect every .txt file in the given directory.
        listOfFiles = [file for file in sorted(os.listdir(dataDir)) if file.endswith(".txt")]

        for fileName in listOfFiles:
            singleFilePath = os.path.join(dataDir, fileName)

            # opens a file and searches for each author in order to store the paper in the dictionary.
            with open(singleFilePath) as singleF:
                text = singleF.read()

            for author in self.listOfAuthors:
                if author in text:
                    # create the key the first time an author is seen, then add the paper.
                    self.DicOfAuthor.setdefault(author, []).append(fileName)

    def returnKeySizeOfDictionary(self):
        return len(self.DicOfAuthor)

    def toString(self):
        return str(self.DicOfAuthor)


if __name__ == '__main__':
    # for testing of this class!
    ListOfAuthors = ["Madison", "Hamilton", "Jay"]
    dataDir = "C:/Schoolprojects/history396/history396GraphicalProject/Data/data"

    test = authorConstruction(ListOfAuthors, dataDir)
    print(test.returnKeySizeOfDictionary())

    for author in ListOfAuthors:
        # .get avoids a KeyError when no file mentions the author.
        print(author, "and their writings:\n", test.DicOfAuthor.get(author, []))

    print(test.toString())

A Python script I wrote that searches through a directory of text files and checks whether each author matches a given file.

Programming is a skill that historians should be pursuing, especially in the coming century. Social history is the practice of recovering the story of the everyday person and bringing their experiences to light to help paint a bigger picture of a certain time in history. In the early 1900s, this could be accomplished by finding primary sources. However, not everyone kept records, and those that were kept had to withstand the test of time before historians could find and record them. Since the turn of the century, records have increasingly been kept digitally, stored in large databases that can be accessed. It would take a single person hundreds of years to read through all the content stored on the web. That is why it is important for historians to learn how to program. The practical way to process such huge amounts of data is to write programs that do the searching. Once data can easily be searched, historians will be able to paint a more accurate picture of the past. On a much smaller scale, this type of activity is clearly demonstrated in the Programming Historian lesson From HTML to List of Words (Part 1). This lesson explores taking a website's content, extracting the text from it, storing that text in a list, and exploring what the website contains. This type of searching and sorting allows historians to quickly search through the transcript of Benjamin Bowsey's 1780 criminal trial and extract information from it. Being able to interact with a piece of code to extract information from a historical document highlights the power that Python has.
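The idea of turning a web page into a list of words can be sketched roughly as follows. This is my own version using Python's standard library html.parser and re modules, not the code from the lesson itself. The names TextExtractor and html_to_words are ones I made up, and the sample string is a short stand-in I wrote, not the real trial transcript.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text that sits between HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # called by the parser for every run of plain text it finds.
        self.chunks.append(data)


def html_to_words(html):
    """Strip the tags from an HTML string and return a list of lowercase words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    # keep runs of letters and digits, dropping punctuation.
    return re.findall(r"\w+", text.lower())


# A made-up stand-in for a downloaded page.
sample = "<html><body><p>The trial of <b>Benjamin Bowsey</b> was held in 1780.</p></body></html>"
words = html_to_words(sample)
print(words)
# ['the', 'trial', 'of', 'benjamin', 'bowsey', 'was', 'held', 'in', '1780']
```

Once the text is in a list like this, it can be counted, searched, or sorted like any other Python list.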

Another reason historians should learn how to program is that the people best placed to think historically are historians. Software is built by programmers, and in most cases the person who built a tool is not a historian. This means that the majority of tools are not developed with a historical use in mind. Instead, a tool is developed for another audience and then adapted for use by a historian. If historians began learning how to program, the tools they built would be better suited to the purposes of historical research.

One thought on “Exploration into Python”

  1. Jim

    Thanks for this post Matt. I think your final point is very important. If historians don’t develop the skills and demand a seat at the table, scholars from other disciplines are going to develop new “Culturomics” methods that process historical data and “discover” there was a major flu in 1919 or a terrible war in 1939. Historians need to bring their expertise and knowledge of the past and their skepticism about simplistic discoveries using new tools to guide the development of tools for a better understanding of the past.
