In this notebook we scrape FOMC statements and minutes to count occurrences of specific words. As usual, this is purely illustrative of what Python can do when it comes to text mining. Professor Amy Handlan, who just joined Brown University, shows in her job market paper how tools of this general type, combined with machine learning techniques, can be used to measure the stance of monetary policy.
import requests  # the de facto library for making HTTP requests in Python

url = 'https://www.federalreserve.gov/newsevents/pressreleases/monetary20210317a.htm'  # one FOMC statement, change as needed

# Open with the GET method
content = requests.get(url).text  # this turns the entire HTML behind the page into one long text string

count = 0
searchword = "inflation"  # whatever word or, for that matter, list of words we're looking for
for word in content.split():  # splits the giant HTML string we created above into whitespace-separated tokens
    if searchword in word.lower():
        count += 1
print("The word " + searchword + " appears " + str(count) + " time(s) in the statement.")
The word inflation appears 10 time(s) in the statement.
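Note that `searchword in word.lower()` is a substring test on each whitespace-separated token, so tokens such as "inflation," or "inflationary" are counted too. The sketch below contrasts that behavior with a whole-word match on a made-up sentence. The helper names `count_substring` and `count_whole_word` and the sample text are illustrative assumptions, not part of the original notebook.

```python
import re

def count_substring(text, searchword):
    """Count whitespace-separated tokens that contain searchword (the loop used above)."""
    return sum(1 for token in text.split() if searchword in token.lower())

def count_whole_word(text, searchword):
    """Count case-insensitive occurrences of searchword as a standalone word."""
    pattern = r"\b" + re.escape(searchword) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Made-up sentence for illustration only
sample = "Inflation is low. Inflationary pressures and inflation expectations remain anchored."

print(count_substring(sample, "inflation"))   # 3: "Inflationary" is counted as well
print(count_whole_word(sample, "inflation"))  # 2: only the standalone word
```

Which of the two is appropriate depends on the question being asked; the substring test is kept in the rest of the notebook.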
This was pure brute force. Now we use BeautifulSoup to first extract only the part of the HTML mumbo jumbo we need, which limits the risk of a miscount.
from bs4 import BeautifulSoup  # the go-to library for scraping

url = 'https://www.federalreserve.gov/newsevents/pressreleases/monetary20210317a.htm'
html = requests.get(url).text
soup = BeautifulSoup(html, 'lxml')  # turns the HTML into a BeautifulSoup object (the 'lxml' parser requires the lxml package)

# Upon inspecting the HTML of the page, we see that the statement starts with
# an HTML tag of the <div> type followed by the class specification below.
# We're telling BeautifulSoup that this is the section of the page we want.
statement = soup.find('div', class_="col-xs-12 col-sm-8 col-md-8").text

print(statement)  # just to check that it worked
# statement.split()
The Federal Reserve is committed to using its full range of tools to support the U.S. economy in this challenging time, thereby promoting its maximum employment and price stability goals. The COVID-19 pandemic is causing tremendous human and economic hardship across the United States and around the world. Following a moderation in the pace of the recovery, indicators of economic activity and employment have turned up recently, although the sectors most adversely affected by the pandemic remain weak. Inflation continues to run below 2 percent. Overall financial conditions remain accommodative, in part reflecting policy measures to support the economy and the flow of credit to U.S. households and businesses. The path of the economy will depend significantly on the course of the virus, including progress on vaccinations. The ongoing public health crisis continues to weigh on economic activity, employment, and inflation, and poses considerable risks to the economic outlook. The Committee seeks to achieve maximum employment and inflation at the rate of 2 percent over the longer run. With inflation running persistently below this longer-run goal, the Committee will aim to achieve inflation moderately above 2 percent for some time so that inflation averages 2 percent over time and longer-term inflation expectations remain well anchored at 2 percent. The Committee expects to maintain an accommodative stance of monetary policy until these outcomes are achieved. The Committee decided to keep the target range for the federal funds rate at 0 to 1/4 percent and expects it will be appropriate to maintain this target range until labor market conditions have reached levels consistent with the Committee's assessments of maximum employment and inflation has risen to 2 percent and is on track to moderately exceed 2 percent for some time.
In addition, the Federal Reserve will continue to increase its holdings of Treasury securities by at least $80 billion per month and of agency mortgage-backed securities by at least $40 billion per month until substantial further progress has been made toward the Committee's maximum employment and price stability goals. These asset purchases help foster smooth market functioning and accommodative financial conditions, thereby supporting the flow of credit to households and businesses. In assessing the appropriate stance of monetary policy, the Committee will continue to monitor the implications of incoming information for the economic outlook. The Committee would be prepared to adjust the stance of monetary policy as appropriate if risks emerge that could impede the attainment of the Committee's goals. The Committee's assessments will take into account a wide range of information, including readings on public health, labor market conditions, inflation pressures and inflation expectations, and financial and international developments. Voting for the monetary policy action were Jerome H. Powell, Chair; John C. Williams, Vice Chair; Thomas I. Barkin; Raphael W. Bostic; Michelle W. Bowman; Lael Brainard; Richard H. Clarida; Mary C. Daly; Charles L. Evans; Randal K. Quarles; and Christopher J. Waller. Implementation Note issued March 17, 2021
# Now the rest is the same
searchword = "inflation"
count = 0
for word in statement.split():
    if searchword in word.lower():
        count += 1
print("The word " + searchword + " appears " + str(count) + " time(s) in the statement.")
The word inflation appears 10 time(s) in the statement.
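Both approaches report 10 for this statement, but in general the raw HTML can inflate the count, because the search word may also appear in tag attributes, link URLs, or metadata. The toy example below illustrates the difference. The markup, the `statement` class name, and the `count_tokens` helper are made up for illustration and are not the Fed's actual page source; the built-in `html.parser` is used here only so that the sketch does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; this is not the Fed's actual page source.
toy_html = """
<html>
  <head><meta name="keywords" content="inflation, employment"></head>
  <body>
    <a href="/inflation-faq.htm">FAQ</a>
    <div class="statement">Inflation continues to run below 2 percent.</div>
  </body>
</html>
"""

def count_tokens(text, searchword):
    """Count whitespace-separated tokens that contain searchword."""
    return sum(1 for token in text.split() if searchword in token.lower())

# Brute force: count over the raw HTML string, markup included
raw_count = count_tokens(toy_html, "inflation")

# Targeted: extract the text of the relevant <div> first, then count
soup = BeautifulSoup(toy_html, "html.parser")
visible = soup.find("div", class_="statement").text
clean_count = count_tokens(visible, "inflation")

print(raw_count, clean_count)  # 3 1
```

Here the meta tag and the link URL each add a spurious hit to the brute-force count.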
Now, out of curiosity, what if we mine the entire minutes?
url = 'https://www.federalreserve.gov/monetarypolicy/fomcminutes20210317.htm'
content = requests.get(url).text  # the raw HTML of the minutes page, as in the first example

count = 0
searchword = "inflation"  # whatever word or, for that matter, list of words we're looking for
for word in content.split():  # splits the giant HTML string into whitespace-separated tokens
    # print(word)
    if searchword in word.lower():
        count += 1
print("The word " + searchword + " appears " + str(count) + " time(s) in the minutes.")
The word inflation appears 64 time(s) in the minutes.
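The same idea extends to a list of search words. Below is a minimal sketch using `collections.Counter` from the standard library. The helper name `count_words` and the choice of search words are assumptions for illustration; the sample sentence is taken from the statement printed earlier. Unlike the loops above, this version matches whole words only.

```python
import re
from collections import Counter

def count_words(text, searchwords):
    """Return a dict mapping each search word to its whole-word count in text."""
    tokens = re.findall(r"[a-z]+", text.lower())  # lowercase alphabetic tokens only
    counts = Counter(tokens)                      # missing words count as 0
    return {w: counts[w.lower()] for w in searchwords}

sample = ("The Committee seeks to achieve maximum employment and inflation "
          "at the rate of 2 percent over the longer run.")

print(count_words(sample, ["inflation", "employment", "unemployment"]))
# {'inflation': 1, 'employment': 1, 'unemployment': 0}
```

In practice, `text` would be the extracted statement or minutes rather than a single sentence.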