Python in Excel: A Game Changer for Financial Data
- Michael Hanson, CTO at FirstRate Data
- 24.08.2024 02:30 pm #data
Microsoft's integration of Python into Excel, slated for release in Q3 2024, is a major advance for financial data professionals using Excel as their core analysis tool. This powerful combination leverages Excel's widespread install base and well-understood UI with Python's robust analytical capabilities. Here’s a detailed look at the enhancements this integration brings to Excel.
Consuming Data
Most data vendors expose their data via API feeds which are easily integrated into programming scripts or custom tools, however, this has been a long-standing weakness for Excel which has had difficulty interacting with data APIs.
Michael Hanson, CTO at FirstRate Data noted that “integrating our data streams in Excel are the by far the common support and feature requests we have”.
Python’s inbuilt HTTP libraries which can efficiently interact with APIs will make consuming data feeds a far simpler and consistent experience in Excel.
Integrated Pandas
The ‘Pandas’ Python library has become the industry standard data processing library. As this was originally an in-house project in a hedge fund, it was specially designed for handling financial time-series data with features such as timeframe resampling, changing data timezones and complex data joins.
Speed
Excel can be very slow and memory inefficient for processing large amounts of data as it typically runs sequentially with the data processed separately and in sequence for each cell. Python by contrast allows for vectorized operations which operate on entire columns or datasets at once.
Data Joining / Filtering
Excel is primarily built for mathematical operations and its non-mathematical functions (such as lookup, vlookup etc) are not very powerful and are typically difficult to implement. Python libraries such as Pandas or the native Python functions by themselves address gap by providing advanced and easy-to-implement data search, filtering and merging functions.
Data Cleaning
One of the most mundane but time-consuming issues in data analysis is cleaning and harmonizing the data. Unfortunately, Excel is a poor fit for this use-case as data cleaning requires multiple layers of complex search and replace operations and Excel only provides a single search/replace function with no complex filters.
Python, by contrast, is ideally suited to data cleaning with several libraries such as ‘datacleaner’ built for this explicit purpose.
Data Visualization
With libraries like Matplotlib and Seaborn, Python enables the creation of complex and highly customizable charts and graphs. These visualizations can provide deeper insights and clearer presentations of financial data.
Predictive Modeling
Python’s machine learning libraries, such as scikit-learn and TensorFlow, could be used directly within Excel to develop predictive models which was previously unavailable within Excel.