Although it is quite easy to parse XML with Python's ElementTree, I still …
Extract standardized financial statements from any 10-K and 10-Q filing. Download filings from EDGAR. sec-api is a Python package for querying the entire SEC filings corpus in real time without the need to download filings.
Natural language processing (NLP) enables us to analyze financial documents such as 10-K forms to forecast stock movements. See Tim Loughran and Bill McDonald, 2016, Textual Analysis in Accounting and Finance: A Survey, Journal of Accounting Research, 54:4, 1187-1230.
Scraping EDGAR with Python (PDF, ResearchGate).
This post on Python SEC Edgar scraping of financial statements is a bit different from the others on my blog. I just want to share a script for scraping financial statements from the SEC EDGAR website.
"Application of Natural Language Processing (NLP) to predict firm performance from 10K and 10Q statements" for PluribusLabs, Jan 2016: analyzing 10-K financial statements using NLTK text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, in order to assess the performance of public companies.
finreportr is a web scraper written in R that allows analysts to query data from the U.S. Securities and Exchange Commission directly from the R console.
SEC EDGAR Filings API. 10-K forms are annual reports filed by companies to provide a comprehensive …
Example fragment from the edgar package: from edgar import Company; company = Company("Oracle Corp", "0001341439"); tree = company. …
Retrieving these filings from the SEC's EDGAR service is complicated, and parsing these forms into plain text for further analysis can be very time-consuming.
Python SEC Edgar 0.1.0 documentation. Searches can be conducted either by stock ticker or by Central Index Key (CIK). Once the code is built, it will be very easy to use.
Developed a Python pipeline to programmatically generate the URL and extract data from the …
parse_submission() takes a full submission SGML document and parses out component documents.
Using NLP on a company's annual reports to predict near … Sentiment counts are based on the Loughran-McDonald dictionary.
d) Then the page of the filing (10-K) is loaded using the URL obtained in step (c).
NOTE: Before you start, make sure that Python 2.7 is already installed on your computer (For …
A CLI tool called sec_edgar_download supports downloading the RSS files and indexing them in a local sqlite3 database, as well as downloading specific 10-K and …
I've recently been working on this Statement Parser and would love some feedback on whether it's an effective tool for value investing.
My goal is to count the occurrences of certain keywords in the visible text body of the 10-K statements.
How to Parse 10-K Report from EDGAR (SEC). It's an evolving area of natural language processing that helps make sense of large volumes of text data.
Upon creation, all latest SEC Form 13F filings are downloaded automatically into a folder in XML format, and the BeautifulSoup package is used to parse the relevant information from the documents into DataFrames.
3.1 Extract all items reported in 8-K filings since 2004; 3.2 Find all 8-K filings with Item 1.01 and/or Item 2.03; 3.3 Nini, Smith and Sufi (2009). Use SAS …
… get_documents(tree, no_of_documents=5). To compare the portfolio difference of the two most recent filings, use the following methods: …
For example, IBM's 10-K filing on 20120228 lists the core 10-K document in HTML format, ten exhibits, four jpg (graphics) files, and six XBRL files.
Data Retrieval: We extracted 10-Ks in HTML format from the EDGAR database of SEC filings.
… and filing (e.g., 10-K) and obtains the URL path for the filing (similar to the logic in Program2.py).
In this series, we begin at the top.
To navigate the SEC.gov website, go to "company filings" near the top right, then use the "fast search" by typing the company's ticker symbol, like AAPL for Apple.
… the 10-K filing; subsequent documents are exhibits.
However, the landscape of 10-K/Q filings has changed dramatically over the past decade, and the text-format filings are now extremely unfriendly for researchers.
Downloading the early years and ZIPping the XBRL files on our local machine: if we want to download data from the early years, we need two additional Python packages: (a) the ElementTree XML parser, because feedparser cannot handle multiple nested elements for the individual filings; (b) the zipfile package, so that we can ZIP the …
To get a filing, you have to agree to terms, complete a CAPTCHA, and parse a PDF file.
In this tutorial we explore how we can use Python and socket.io to create a real-time live feed of new filings published on SEC EDGAR.
The text version of the filings provided on the SEC server is an aggregation of all information provided in the browser-friendly files also listed on EDGAR for a specific filing.
Python SEC Edgar is a Python application used to download and parse complete submission filings from the sec.gov/edgar website.
In this first post, we are going to build a Python script that will allow us to retrieve annual or quarterly reports from any company. The file is called "company.idx" and has the names, dates, and links of all financial reports in 2021. Parse the HTML to find the URL(s) of the report(s) of interest.
By using python-edgar and some scripting, you can easily rebuild a master index of all filings since 1993 by stitching quarterly index files together. I used Python 3.6: python36 pip install -r requirements.txt
Python SEC Edgar. (3) Obtain HTML files by URL.
This information is usually reported under "Part 2 Item 5 Market for Registrant's Common Equity, Related Stockholder Matters and Issuer Purchases of Equity Securities" in 10-Ks and "Part 2 Item 2 Unregistered Sales of Equity Securities and Use of Proceeds" in 10-Qs.
While edgarWebR is primarily focused on providing an interface to the online SEC tools, there are a few activities for handling filing documents for which no current tools exist.
Examples of forms you may be interested in here are the 10-K and 10-Q forms.
Answer (1 of 3): Simply use these Python libraries: https://github.com/lukerosiak/pysec https://pypi.python.org/pypi/SECEdgar https://github.com/altova/sec-xbrl
References: Bonsall, S., A. Leone, B. Miller, and K. Rennekamp.
sec-edgar-downloader. Once you have a copy of the source, you can install it with: $ pip install -r requirements.txt
Fast Solr search over 4 million filings for all 10-K, 10-Q, 8-K, IPO prospectuses, proxy filings, and SEC correspondence since 1994. Derived datasets.
Using Regular Expressions to Search SEC 10K Filings. Plus, you can access all the filings through an FTP site.
Machine learning models implemented in trading are often trained on historical stock prices and other quantitative data to predict future stock prices.
10-K form: Business, Risk, MD&A.
Explored the SEC EDGAR website for all 10-Ks filed during calendar year 2016 by firms included in the Dow Jones Industrial Average; determined and tabulated the following information for each filing: …
In this article I will show how to collect and parse 13F filing data from the SEC.
Then use the `.finditer()` method to match the regex to `document['10-K']`. Note that Item 1B and Item 8 are added to find the end of sections Item 1A and Item 7A, respectively. Notice that each item is matched twice.
This is because each item appears first in the index and then in the corresponding section.
The 10-K is the annual report, and the 10-Q is a quarterly report.
Extracted tables from SEC EDGAR to find the 10-K and 10-Q filings using Beautiful Soup and an HTML parser.
I am trying to parse the text section of the SEC EDGAR filings in Python 3.
type: the general type of the document, extracted from the TYPE header and cleaned up (so 10-K405 becomes 10-K); type_exact: the exact text extracted from the TYPE field; documents: array of all the documents (between tags).
…), all historical observations will be updated, and the historical state is not recorded.
For each report of interest, send a request to the report's URL. (2) Read in the relevant quarterly 10-K rows per company.
This section explains how to parse HTML using Python and the Beautiful Soup package.
As a side project, which now seems to be taking over most of my life, I parse the 10-K filings, extract the Risk Factors sections, and use an ML model to extra…
… to a new txt file in Notepad, save it as txt, then change the extension to "htm" or "html", and open it with Chrome or IE.
I had read the paper Lazy Prices, which describes a methodology for parsing Management Discussion & Analysis from 10-K and 10-Q SEC filings.
OpenEDGAR's Index Parser, Filing Parser, and Filing Document Parser are designed with the flexibility to parse even the older SGML tags that are often found in some SEC filings.
Textual analysis on SEC filings.
This works pretty terribly, since companies have so many different ways of writing the data.
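
The regex approach to locating item sections described above can be sketched as follows. This is a minimal illustration, not the code from the quoted notebook: the pattern, the function name extract_item_sections, and the sample text are all assumptions. Because each item is matched twice (once in the index, once in the body), the sketch keeps only the last match per item and treats the next item's start as the end of the section.

```python
import re

# Assumed pattern: matches "Item 1A", "Item 1B", "Item 7", "Item 7A", "Item 8"
# at the start of a line, case-insensitively.
ITEM_PATTERN = re.compile(r"^\s*item\s+(1a|1b|7a|7|8)\b\.?",
                          re.IGNORECASE | re.MULTILINE)

def extract_item_sections(text):
    """Return {item label: section text} using the LAST match of each item."""
    last_seen = {}
    for match in ITEM_PATTERN.finditer(text):
        # Later matches overwrite earlier ones, so index entries are discarded.
        last_seen[match.group(1).upper()] = match.start()
    ordered = sorted(last_seen.items(), key=lambda kv: kv[1])
    ends = [start for _, start in ordered[1:]] + [len(text)]
    return {label: text[start:end].strip()
            for (label, start), end in zip(ordered, ends)}

# Made-up miniature filing: an index followed by the body.
SAMPLE = """Table of Contents
Item 1A. Risk Factors 12
Item 1B. Unresolved Staff Comments 20
Item 7. Management's Discussion 30
Item 7A. Market Risk 45
Item 8. Financial Statements 50

Item 1A. Risk Factors
Our business faces competition.
Item 1B. Unresolved Staff Comments
None.
Item 7. Management's Discussion
Revenue grew.
Item 7A. Market Risk
We are exposed to interest rate risk.
Item 8. Financial Statements
See the attached statements.
"""

sections = extract_item_sections(SAMPLE)
print(sections["1A"])
```

Real filings vary widely in how they write item headings, so a pattern like this should be expected to miss some filings and would need tuning against actual documents.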
The problem with SEDAR is that they don't really make it easy to extract the data.
This post demonstrates how to do the following in a notebook titled Dashboarding SEC Filings, available from SageMaker JumpStart: retrieve parsed 10-K, 10-Q, and 8-K filings.
Parse the response to download the desired report. Build a master index of SEC filings.
The function parse_13f_filing() parses 13F forms to extract data regarding institutional investors and their portfolios.
Code fragment: # Open the company idx file: index_file = open("company.idx").readlines() # Just confirming the header of the file: print …
180,787 10-K filings; 8 seconds on average to download a single filing.
This dataset is freely available.
Analytics Suite, to develop custom-tailored datasets from all SEC filings, parsing millions of regulatory reports; WRDS Quant Alpha, a powerful tool to discover and test unknown stock anomalies; and the Wharton School's OTIS, an online trading and investment simulator: WRDS is the global gold standard in data management and …
Our procedure: (1) Retrieve quarterly tab-separated files from the EDGAR index.
The data model, clients, and parsers provide the building blocks for constructing research databases from EDGAR.
The goal of this project is to make it easy to get filings from the SEC website onto your computer for the companies and forms you desire.
Note that you will need to handle the case of the 20-F, which is the equivalent form for foreign companies.
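
Reading the "company.idx" file mentioned above might look like the sketch below. It assumes that the data rows follow a dashed separator line and that columns are separated by at least two spaces; the real file is fixed-width, so a row whose company name runs into the next column would be skipped by this approach. The function name parse_company_index and the sample rows (including the company names and paths) are made up for illustration.

```python
import re

def parse_company_index(lines, form_type="10-K"):
    """Return rows of the given form type from company.idx-style lines."""
    rows = []
    in_body = False
    for line in lines:
        if not in_body:
            # Everything up to the dashed separator is header text.
            in_body = line.startswith("-----")
            continue
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) != 5:
            continue  # skip rows that do not split cleanly (assumption)
        name, form, cik, date_filed, path = fields
        if form == form_type:
            rows.append({"name": name, "form": form, "cik": cik,
                         "date": date_filed, "path": path})
    return rows

# Made-up sample in the general shape of the index file.
SAMPLE_LINES = """Description: Index by Company Name

Company Name            Form Type   CIK       Date Filed  File Name
-------------------------------------------------------------------
EXAMPLE CORP            10-K        1234567   2021-02-26  edgar/data/1234567/0001234567-21-000010.txt
EXAMPLE CORP            8-K         1234567   2021-03-01  edgar/data/1234567/0001234567-21-000011.txt
SAMPLE HOLDINGS INC     SC 13G      7654321   2021-02-12  edgar/data/7654321/0007654321-21-000002.txt
""".splitlines()

ten_ks = parse_company_index(SAMPLE_LINES)
print(ten_ks)
```

With a real file, the result of open("company.idx").readlines() can be passed in place of SAMPLE_LINES.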
SEC EDGAR filings API: query API to access historical filings in the EDGAR archives; live feed streaming; filings mapped to ticker, CIK, and SIC; over 150 filing types; filings from 1993 to present; JSON formatted; supports Python, Node.js, React, C++, and many more; 10-Q, 10-K, 8-K, 4, S-1; free trial.
After calling the URL of the desired stock, go over all recent filings and look for the 10-K/10-Q (or any other file you need to scrape data from).
Extracting Textual Data from 10-K: this tutorial will guide you through running a set of four Python scripts to extract textual data (the Item 1 section) from EDGAR's 10-K files.
Hidden cost extractor for SEC filings. sec-filings-database: financial market API and streaming API for developers.
Python dependencies (i.e., modules you must download that are accessed by the program): MOD_Load_MasterDictionary_v2020.py, a module to load the Loughran-McDonald master dictionary.
The function parse_10k_filing() parses 10-K forms to extract the sections business description, risk, and management discussion and analysis.
Regular expressions, or "regex", are text-matching patterns used for searching text.
Obtaining easily parseable SEC filings data. Getting structured SEC EDGAR data (OKFN discussion forum).
(3) How does risk in company 10-K reports correlate with stock return risks?
Topic modeling can streamline text document analysis by extracting the key topics or themes within the documents.
The CorpWatch API seems to do exactly what we need, but it may be out of date; we need to drop them an email.
You will find that it is exactly the HTML file. But if you want to extract data programmatically, the last option is the most practical.
Parsing Tools.
With this file in hand, we are going to write a command to download the first 100 10-K files that appear on the list.
Centralized storage and parsing of SEC filing contents: 19.8 million+ records of electronic filings with the SEC since 1994, as well as the text, HTML, and PDF filings available on the WRDS server.
According to Investopedia, Core Earnings are an important way to determine the true profitability of a company's underlying business.
In this series of posts, Scraping SEC Edgar with Python, we are going to learn how to parse company financials from SEC EDGAR using Python.
Form 13F is a quarterly filing required of institutional investment managers with over $100 million in qualifying assets.
In the case of SEC 10-K filings, regex can greatly assist the search process.
As of now I've been scraping Nasdaq's SEC filings and trying to parse the plain-text PDFs by searching for keywords.
Andriy Bodnaruk, Tim Loughran and Bill McDonald, 2015, Using 10-K Text to Gauge Financial Constraints, Journal of Financial and Quantitative Analysis, 50:4, 1-24.
We will simply pass the name of a company and the script will …
2013-2016 Cleaned/Parsed 10-K Filings with the SEC: dataset by jumpyaf on data.world.
sec-edgar-downloader is a Python package for downloading company filings from the SEC EDGAR database. See the list of supported form types here.
A financial analyst's time is valuable; it shouldn't be wasted on manual data entry. It aims to eliminate time wasters from a financial analyst's workflow, such …
In this section we are going to download 10-K files from the SEC EDGAR website.
EDGAR posts any PDF versions of the filings, the XML documents, and the full text of any filing.
13F holdings API included.
Keyword search results: this directory is created upon use of the searchFilings function and saves the extracted filing search results in HTML.
More than 150 filing types are supported, e.g. 10-Q, 10-K, 4, 8-K, 13-F, S-1, 424B4, and many more.
An existing Python package was used to scrape this data.
• Worked on SEC 13-F filings to scrape XML tables using Python parsing and store the cleaned data on a MySQL server.
At this point we can comfortably get most of the filings we want from a range of different directories on the SEC website. Visit Accessing EDGAR Data to learn more about EDGAR.
edgar-10k-mda. This repo contains some Python code I used to download Form 10-K filings from the EDGAR database and then extract the MD&A section from the downloaded filings heuristically.
We use the streaming API provided by sec-api.io to establish a…
That is, the first document in the txt file is the HTML file, i.e., the main body of the 10-K filing.
Firm Historical Headquarter State from SEC 10K/Q Filings. Why the need to use SEC filings? In the Compustat database, a firm's headquarter state (and other identification) is in fact the current record stored in comp.company. This means once a firm relocates (or updates its state of incorporation, address, etc. …
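
The idea that the full-submission txt file is a sequence of documents, with the main 10-K body first and exhibits after it, can be sketched as below. This is only an illustration under the assumption that each component is wrapped in DOCUMENT tags with a TYPE header line and a TEXT block; the function name split_submission and the sample submission are hypothetical, and this is not the implementation of any package named in the text.

```python
import re

DOCUMENT_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL)
TYPE_RE = re.compile(r"<TYPE>([^\n<]+)")
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)

def split_submission(raw):
    """Split a full-submission text file into its component documents."""
    documents = []
    for match in DOCUMENT_RE.finditer(raw):
        block = match.group(1)
        type_match = TYPE_RE.search(block)
        text_match = TEXT_RE.search(block)
        documents.append({
            "type": type_match.group(1).strip() if type_match else None,
            # Fall back to the whole block if no TEXT wrapper is present.
            "text": (text_match.group(1) if text_match else block).strip(),
        })
    return documents

# Made-up miniature submission with a main document and one exhibit.
SAMPLE_SUBMISSION = """<SEC-DOCUMENT>example.txt
<DOCUMENT>
<TYPE>10-K
<SEQUENCE>1
<TEXT>
<html><body>Annual report body</body></html>
</TEXT>
</DOCUMENT>
<DOCUMENT>
<TYPE>EX-21
<SEQUENCE>2
<TEXT>
List of subsidiaries
</TEXT>
</DOCUMENT>
</SEC-DOCUMENT>
"""

docs = split_submission(SAMPLE_SUBMISSION)
main_document = docs[0]  # the 0th document is typically the main form
print(main_document["type"], len(docs))
```

Taking docs[0] as the main form follows the convention stated in the text; it is worth checking the type field rather than relying on position alone.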
Form 8-K is what a company uses to disclose significant developments that occur between filings of Form 10-K or Form 10-Q. Major organizational or company events that would necessitate the filing of a Form 8-K include bankruptcies or receiverships, material impairments, completion of acquisition or disposition of assets, and …
First, use EDGAR to search for the company of interest.
… get_all_filings(filing_type="10-K"); docs = Company. …
Parsing SEC Filings (Newer Ones) in Python, Part 5 (December 30, 2019): this is the final video of our series, and we close it off by discussing strategies for more complex parsing.
Count keywords in the text body of SEC EDGAR 10-K filings with Python.
A small Python library which downloads companies' 10-K and 10-Q XBRL-format filings from the SEC's EDGAR website.
The master index file can then be fed to a database, a pandas DataFrame, Stata, etc. The SEC filings index has been split into quarterly files since 1993 (1993-QTR1, 1993-QTR2, …).
from edgar import Company, TXTML; company = Company …
By default, EDGAR provides all of the reports available for a company, regardless of the source.
The Python program crawls the web to obtain URL paths for company filings of the required reports, such as the 10-K.
The related code for parsing the 10-K filings is available on Samuel Bonsall's website.
Hi, we have a programming task we would like to outsource: we want someone to write code in Python to parse SEC 10-K filings (downloadable from the SEC's EDGAR database) for a list of ~1,000 companies (we can provide the CIK codes, the unique identifiers the SEC uses, in CSV) and tell us how many words are in certain sections of the filings (the filings are in a standardised format …
… 10-K and the first in the txt file.
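
Since the filings index is split into quarterly files, rebuilding a master index amounts to generating one URL per year and quarter and stitching the downloads together. A minimal sketch of the URL generation follows; the function name quarterly_index_urls is hypothetical, and the URL layout under Archives/edgar/full-index is an assumption about EDGAR's public directory structure that should be checked against the SEC's own documentation. No download is performed here.

```python
def quarterly_index_urls(first_year, last_year, filename="master.idx"):
    """Build one index URL per quarter for the given range of years."""
    # Assumed base path of EDGAR's quarterly full-index directory.
    base = "https://www.sec.gov/Archives/edgar/full-index"
    return [
        f"{base}/{year}/QTR{quarter}/{filename}"
        for year in range(first_year, last_year + 1)
        for quarter in range(1, 5)
    ]

urls = quarterly_index_urls(1993, 1994)
for url in urls[:2]:
    print(url)
```

Passing filename="company.idx" would target the company-sorted variant of the index mentioned elsewhere in the text. Any real downloader should also follow the SEC's current access rules, which this sketch does not address.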
I have more than 90,000 forms to parse in total, so it won't be feasible to do it manually.
Extracted large amounts of data from SEC EDGAR. Web scraping.
The program then performs a textual analysis and counts the number of occurrences of words in the filing that reflect, for example, uncertainty (or any other quality specified by the researcher).
The 0th document is typically the main form, i.e. …
We only request that if you use the data, you reference our paper and acknowledge the data source.
To get a company's latest five 10-Ks, run: …
Regular text: data provided in regular files (*.txt). Web pages: data to be viewed in a browser (*.htm). XBRL: data provided in XBRL-formatted files (*.xml). The first two options are fine if you want to read report data yourself.
A Python application used to download and parse complete submission … of all filings are stored in index files, so we need to download these index files.
Example textual analyses.
I will only explain how it works in a YouTube video, because writing an article about it would add little value.
The stock price database provided 160,926 potential target events, of which 38,807 could be matched with the downloaded annual report database.
8-K Forms. Company API: a collection of RESTful methods that return various financial data for a requested company, including balance sheets, stock quotes, company look-up utilities, and more.
Answer (1 of 4): Whilst the data is freely available through the SEC RSS feeds, it still takes a lot to read through the various filings. (2017).
pip install edgar.
I would suggest directing our research efforts to HTML-format filings with the help of BeautifulSoup.
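
Counting keyword occurrences in the visible text of an HTML filing, as described above, can be sketched with the standard library alone. Note the substitution: the text suggests BeautifulSoup, while this sketch uses html.parser so that it runs without third-party packages. The class and function names, the sample HTML, and the three keywords are illustrative assumptions; the keywords are not taken from the Loughran-McDonald dictionary.

```python
import re
from collections import Counter
from html.parser import HTMLParser

class VisibleTextParser(HTMLParser):
    """Collect text that is not inside <script> or <style> elements."""
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)

def count_keywords(html, keywords):
    """Count whole-word, case-insensitive keyword hits in the visible text."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z']+", " ".join(parser.chunks).lower())
    counts = Counter(words)
    return {keyword: counts[keyword.lower()] for keyword in keywords}

# Made-up sample page; the script content must not be counted.
SAMPLE_HTML = (
    "<html><head><style>p{color:red}</style>"
    "<script>var risk = 1;</script></head>"
    "<body><p>Demand is uncertain and the outcome may depend on risk. "
    "Risk factors may change.</p></body></html>"
)

print(count_keywords(SAMPLE_HTML, ["risk", "may", "uncertain"]))
```

For tens of thousands of forms, the same function can be applied file by file, with the keyword list replaced by whichever word list the analysis calls for.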
In this article, I show how to apply topic modeling to a set of earnings call transcripts using a popular approach called Latent Dirichlet Allocation (LDA).
SEC EDGAR Downloader, Release 4.2.0: sec-edgar-downloader is a Python package for downloading company filings from the SEC EDGAR database.
If you copy 10-K …
Tutorial 2. A small library to access files from the SEC's EDGAR.
Generic_Parser.py: program to generate sentiment counts for all files contained within a specified folder.
Python SECEdgar: download SEC filing files (only 10-K, no 20-F of foreign ADR companies). Scraping SEC Filings: download SEC filings.
You can use the SEC CIK lookup tool if you cannot find an appropriate ticker.
Newly published filings are accessible in real time; XBRL-to-JSON converter and parser API.
The EDGAR site maintains monthly RSS feeds describing each of the filings.
Build a master index of SEC filings since 1993 with python-edgar.
For example, the HTML view of the 10-K statement in the previous example can be found at the file path "Edgar filings_HTML view -> Form 10-K -> 38079 -> 38079_10-K_2005-03-15_0001047469-05-006546.html".
Specifically, all I'm trying to do (at the moment) is gather the historical shares …