Getting Started with Time Series Data Science

By Anais Dotis-Georgiou / Product, Use Cases, Developer, Getting Started
Mar 15, 2021

Are you interested in performing time series forecasting or anomaly detection, but you don’t know where to start? If so, you’re not alone. There is an overwhelming variety of libraries, algorithms, and workflow recommendations for these tasks. As a Developer Advocate at InfluxDB, the leading time series database, I’ve researched time series data science methodologies and best practices for forecasting and anomaly detection. Today I want to summarize some important concepts about time series as well as share resources to get you started on your time series data science journey.

Why should a beginner interested in data science start learning about time series?

If you’re interested in becoming a data scientist, learning about data science as it pertains to time series is a great place to start. Time series data is data that is indexed chronologically. Because it’s indexed in time, each data point is often related to the points that came before it. To explain what I mean, let’s take a look at weather data. The temperature in your city right now is correlated with the temperature an hour ago, and even with the temperature last week or at the same time last year. In other words, the temperature data is correlated with itself at other points in time. This statistical phenomenon is called autocorrelation, and it is one of the reasons that time series data is unique in the data world.
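To make autocorrelation concrete, here is a minimal sketch that measures it on made-up hourly temperatures. The `autocorrelation` helper and the synthetic `temps` series are assumptions for illustration only; pandas offers a similar calculation through `Series.autocorr(lag=...)`.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at a given positive lag."""
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = sum(values) / n
    # Total variation of the series around its mean.
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Covariation between the series and a copy of itself shifted by `lag`.
    numer = sum(
        (values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom


# Synthetic data: two weeks of hourly temperatures with a 24-hour cycle.
temps = [15 + 8 * math.sin(2 * math.pi * hour / 24) for hour in range(24 * 14)]

lag_1 = autocorrelation(temps, 1)    # compared with one hour earlier
lag_12 = autocorrelation(temps, 12)  # compared with half a day earlier
lag_24 = autocorrelation(temps, 24)  # compared with the same hour yesterday
print(f"lag 1h: {lag_1:.2f}, lag 12h: {lag_12:.2f}, lag 24h: {lag_24:.2f}")
```

Values close to 1 at the 1-hour and 24-hour lags mean the series closely tracks its own past, while the strongly negative value at the 12-hour lag reflects the opposite half of the daily cycle.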

As a result, several data science algorithms that work for other types of data don’t work as well for time series data. This is because several advanced prediction and anomaly detection algorithms, such as neural networks, rely on the assumption that your dataset doesn’t exhibit certain statistical attributes common to time series data, like autocorrelation.

You can still use neural networks on time series data that contains attributes that violate the network’s assumptions, but you have to eliminate those attributes first. For example, you can remove autocorrelation from your time series through differencing, but this type of data pre-processing can be tricky. Luckily, statistical algorithms are generally easier to understand than neural nets. Statistical methods are also frequently excellent predictors and good at identifying anomalies. These two factors make learning about time series an excellent place for beginners to start their data science journey.
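As a rough illustration of differencing, the sketch below builds a synthetic random walk, subtracts each point from the one after it, and compares the lag-1 autocorrelation before and after. The helper names and the data are assumptions made for this example; pandas provides the same transformation as `Series.diff()`.

```python
import random


def difference(values, periods=1):
    """Return values[t] - values[t - periods] for every t where both exist."""
    return [values[t] - values[t - periods] for t in range(periods, len(values))]


def lag1_autocorrelation(values):
    """Sample autocorrelation between each point and the one before it."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum((values[t] - mean) * (values[t + 1] - mean) for t in range(n - 1))
    return numer / denom


# Synthetic random walk: each point is the previous point plus Gaussian noise.
rng = random.Random(42)
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + rng.gauss(0, 1))

diffed = difference(walk)
raw_acf = lag1_autocorrelation(walk)
diff_acf = lag1_autocorrelation(diffed)
print(f"lag-1 autocorrelation before differencing: {raw_acf:.2f}")
print(f"lag-1 autocorrelation after differencing:  {diff_acf:.2f}")
```

A random walk is the friendly case, because a single round of differencing leaves only the noise. Real data with seasonality may need a seasonal difference as well (for example `periods=24` on hourly data), which is part of what makes this pre-processing tricky.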

Recommended tools for a beginner looking to learn about time series data science

The first step in performing forecasting or anomaly detection is to learn about the various algorithms and methods that exist to help you achieve your goal. Always make sure to research the underlying statistical assumptions of the algorithm you choose, and verify whether or not your data violates those assumptions. I always turn to Jupyter Notebooks for preliminary algorithm selection research. Jupyter Notebooks let me try out algorithms on sample data sets to better understand various Python libraries and their time series algorithms. Once I feel that I understand the library and algorithm I want to employ, I test the performance of that algorithm on my own dataset. I store all of my time series data in InfluxDB and use the Python client to pull certain data sets out for further analysis.
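As a sketch of that last step, the snippet below pulls one field out of InfluxDB into a Pandas DataFrame. It assumes InfluxDB 2.x with the `influxdb-client` package and pandas installed; the bucket, measurement, field, URL, token, and org values are placeholders, and `build_flux_query` and `fetch_dataframe` are hypothetical helper names rather than part of the client library.

```python
def build_flux_query(bucket, measurement, field, start="-30d"):
    """Build a Flux query that returns the time and value of one field."""
    return (
        f'from(bucket: "{bucket}")\n'
        f"  |> range(start: {start})\n"
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")\n'
        f'  |> filter(fn: (r) => r._field == "{field}")\n'
        f'  |> keep(columns: ["_time", "_value"])'
    )


def fetch_dataframe(url, token, org, query):
    """Run a Flux query and return the result as a Pandas DataFrame."""
    # Imported here so the query builder can be used without the client installed.
    from influxdb_client import InfluxDBClient

    client = InfluxDBClient(url=url, token=token, org=org)
    try:
        return client.query_api().query_data_frame(query)
    finally:
        client.close()


# Placeholder names: replace with your own bucket, measurement, and field.
query = build_flux_query("weather", "temperature", "degrees", start="-7d")
print(query)

# With a running InfluxDB instance (placeholder connection details):
# df = fetch_dataframe("http://localhost:8086", "my-token", "my-org", query)
# df.set_index("_time")["_value"].plot()
```

Keeping the query narrow (one measurement, one field, a bounded time range) means the notebook only loads the data the analysis actually needs.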

Time series data science resource for InfluxDB

While InfluxDB allows you to transform your data and even write custom functions for anomaly detection with Flux, I want to introduce you to the Notebooks repo. This repo contains a variety of Jupyter Notebooks to help you get started with InfluxDB and time series data science tasks. Within this repo you can learn how to:

- Get started with Python and InfluxDB
- Get started with Pandas and InfluxDB
- Use the Flux interpreter for Jupyter Notebooks
- Perform anomaly detection
  - Multiple time series
  - Single time series
- Perform forecasting
  - FB Prophet
  - LSTMs
  - Holt-Winters
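To give a feel for what anomaly detection on a single time series can look like, here is a minimal rolling z-score sketch. It is not taken from the Notebooks repo; the function name, window size, threshold, and synthetic data are all assumptions chosen for illustration.

```python
import math
import statistics


def rolling_zscore_anomalies(values, window=24, threshold=3.0):
    """Return indices of points that sit more than `threshold` standard
    deviations away from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mean = statistics.fmean(history)
        stdev = statistics.stdev(history)
        if stdev == 0:
            continue  # flat history: the z-score is undefined
        if abs(values[i] - mean) / stdev > threshold:
            anomalies.append(i)
    return anomalies


# Synthetic data: three days of hourly temperatures with one injected spike.
temps = [15 + 8 * math.sin(2 * math.pi * hour / 24) for hour in range(72)]
temps[60] += 25

anomalies = rolling_zscore_anomalies(temps)
print(anomalies)
```

A window of 24 points covers one full daily cycle in this example, so the normal swing of the temperature stays well inside the threshold and only the injected spike stands out.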
