Webscraping and APIs Using Python for Social Scientists

If you need an accessible version of this item, please email your request to iusw@iu.edu so that they may create one and provide it to you.
Indiana University Workshop in Methods
In recent years, social scientists have increased their efforts to access new datasets from the web or from large databases. Application Programming Interfaces (APIs) offer an easy way to access such data. This workshop introduces techniques for working with APIs in Python to retrieve data from sources such as Wikipedia or The New York Times. It is intended for researchers who are new to working with APIs but are familiar with Python or have completed the Introduction to Python workshop. Python is best learned hands-on. To sidestep any installation issues, we will be coding in Jupyter Notebooks with Binder. This means that participants will be able to follow along on their own machines without needing to download any packages or programs in advance. We do recommend requesting a ProPublica Congress API key in advance (https://www.propublica.org/datastore/api/propublica-congress-api). This allows participants to run the API script on their own machines.
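As a rough illustration of the kind of API request the workshop covers, the sketch below queries the public MediaWiki API on Wikipedia for the introductory text of an article. It is not taken from the workshop materials. The function names, the User-Agent string, and the hand-written sample response are assumptions made for this example, and only the Python standard library is used.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public MediaWiki API endpoint for English Wikipedia.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build a URL asking for the plain-text intro of one article."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "extracts",   # return page text extracts
        "exintro": 1,         # only the section before the first heading
        "explaintext": 1,     # plain text instead of HTML
        "titles": title,
    }
    return f"{API_URL}?{urlencode(params)}"


def parse_extracts(payload):
    """Map each returned page title to its extract (empty if missing)."""
    pages = payload.get("query", {}).get("pages", {})
    return {page["title"]: page.get("extract", "") for page in pages.values()}


def fetch_intro(title, timeout=10):
    """Send the request and parse the JSON reply (requires network access)."""
    # The User-Agent value is a hypothetical placeholder; use your own contact info.
    request = Request(
        build_query_url(title),
        headers={"User-Agent": "api-workshop-example/0.1"},
    )
    with urlopen(request, timeout=timeout) as response:
        return parse_extracts(json.load(response))


# Hand-written stand-in with the same shape as a real reply, so the
# parsing step can be tried without a network connection.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "123": {"pageid": 123, "title": "Sociology",
                    "extract": "Placeholder intro text."}
        }
    }
}

if __name__ == "__main__":
    print(build_query_url("Sociology"))
    print(parse_extracts(SAMPLE_RESPONSE))
```

Calling `fetch_intro("Sociology")` in a notebook with internet access would return a dictionary keyed by article title. APIs that require a key, such as the ProPublica Congress API mentioned above, follow the same request-and-parse pattern but expect the key to be sent with each request as described in their own documentation.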
Helge-Johannes Marahrens is a doctoral student in the Department of Sociology at Indiana University. He recently earned an MS in Applied Statistics and is currently working toward a PhD in Sociology. His research interests include cultural consumption, stratification, and computational social science, with a particular focus on Natural Language Processing (NLP).
Anne Kavalerchik is a doctoral student in the Departments of Sociology and Informatics at Indiana University. Her research interests are broadly related to inequality, social change, and technology.