Sklearn, Statsmodels, Tensorflow… Neural nets, regressions, time series predictions, all at once and really simple…

There are many sophisticated algorithms, libraries and ideas for making predictions, but it takes time to try them all and compare them. That is why we have **predictit**.

Nowadays one can just find an appropriate method, import it, and it's done. But in this age of abundance, even though we do not have to develop our own models, it takes a lot of time to compare all the possible solutions. We have many libraries with machine learning models, many libraries for data analysis and many for data preprocessing, but we have to join all the fragments ourselves. There is a library/framework for such tasks. It's called **predictit**. It's open source, with the source code and documentation available online.

Compare more than 20 models from Sklearn, Statsmodels, Tensorflow and more in just a couple of minutes, and also find the optimal input parameters? Yes! How to do that? After

`pip install predictit`

All you need is

```python
import predictit

predictions = predictit.main.predict()
```

That's it. Everything works (on generated test data, of course). Just input the data you want to predict and it's done… You can configure it in three ways: pass parameters to the *predict* function, use command line arguments, or edit the values in *config.py*. Data sources can be CSV, a pandas DataFrame, a NumPy array or SQL. Possible outputs are predictions as a NumPy array or an interactive plot. If you're not using Python, use command line arguments like below and run it in a terminal in the project folder. Use `python main.py --help` for info on more parameters.

`python main.py --function predict --data_source 'csv' --csv_path 'test_data/daily-minimum-temperatures.csv' --predicted_column 1`

Results can look like this.

After hovering with the mouse, you can see exact values, model names and the ranking of models by error criterion. The framework can work with datetimes; data can be resampled and predicted at a given frequency.
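Resampling itself is standard pandas functionality; a minimal sketch of downsampling a daily series to monthly means before predicting at that frequency (plain pandas, not predictit's own API) could look like this:

```python
import numpy as np
import pandas as pd

# Hypothetical daily series, similar in shape to the temperature data above
index = pd.date_range("2020-01-01", periods=90, freq="D")
daily = pd.Series(np.random.default_rng(0).normal(10, 3, size=90), index=index)

# Resample to monthly frequency (month-start labels); predictions would then
# be made on this coarser series instead of the raw daily one
monthly = daily.resample("MS").mean()
print(monthly)
```
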

What models are used? For example…

- AR (autoregressive model)
- ARIMA
- Autoregressive Linear neural unit
- Conjugate gradient
- Bayes Ridge Regression
- Extreme learning machine
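Several of these, such as the AR model, boil down to fitting lagged values of the series with least squares. A self-contained NumPy sketch of that idea (an illustration, not predictit's actual implementation) for an autoregression of order 3:

```python
import numpy as np

def fit_ar(series, lags=3):
    """Fit AR coefficients by ordinary least squares on lagged values."""
    X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)])
    y = series[lags:]
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

def predict_ar(series, coefs, steps):
    """Roll the model forward, feeding each prediction back in as input."""
    history = list(series[-len(coefs):])
    out = []
    for _ in range(steps):
        nxt = float(np.dot(coefs, history))
        out.append(nxt)
        history = history[1:] + [nxt]
    return out

series = np.sin(np.linspace(0, 20, 200))  # toy data with clear structure
coefs = fit_ar(series, lags=3)
preds = predict_ar(series, coefs, steps=5)
```

A sampled sine satisfies a linear recurrence exactly, so the fitted AR model continues it almost perfectly; real data will of course be noisier.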

The software can be used as a Python library or as a standalone framework that you can edit in any fancy way. Check the official readme and the tests for some use cases. Read the whole *config.py* file to see everything you can do.

If you want to predict like a pro, you can start here…

```python
import predictit

predictit.config.predicts = 12  # Create 12 predictions
predictit.config.data_source = 'csv'  # Load data from CSV
predictit.config.csv_adress = r'E:\VSCODE\Diplomka\test_data\daily-minimum-temperatures.csv'  # CSV file with data
predictit.config.save_plot_adress = r'C:\Users\TruTonton\Documents\GitHub'  # Where to save the HTML plot
predictit.config.datalength = 1000  # Consider only the last 1000 data points
predictit.config.predicted_columns_names = 'Temp'  # Column name that we want to predict
predictit.config.optimizeit = 0  # Whether to find the best parameters for models
predictit.config.compareit = 6  # Visualize the 6 best models
predictit.config.repeatit = 4  # Repeat the calculation 4 times on shifted data to reduce the chance of accidental success
predictit.config.other_columns = 0  # Whether to use other columns or not

# Choose the models that will be computed
used_models = {
    "AR (Autoregression)": predictit.models.ar,
    "ARIMA (Autoregression integrated moving average)": predictit.models.arima,
    "Autoregressive Linear neural unit": predictit.models.autoreg_LNU,
    "Conjugate gradient": predictit.models.cg,
    "Extreme learning machine": predictit.models.regression,
    "Sklearn regression": predictit.models.regression,
}

# Define the parameters of the models
n_steps_in = 50  # How many lagged values go into the models
output_shape = 'batch'  # Whether batch or one-step models

models_parameters = {
    "AR (Autoregression)": {"plot": 0, 'method': 'cmle', 'ic': 'aic', 'trend': 'nc', 'solver': 'lbfgs'},
    "ARIMA (Autoregression integrated moving average)": {"p": 12, "d": 0, "q": 1, "plot": 0, 'method': 'css', 'ic': 'aic', 'trend': 'nc', 'solver': 'nm', 'forecast_type': 'out_of_sample'},
    "Autoregressive Linear neural unit": {"plot": 0, "lags": n_steps_in, "mi": 1, "minormit": 0, "tlumenimi": 1},
    "Conjugate gradient": {"n_steps_in": 30, "epochs": 5, "other_columns_lenght": None, "constant": None},
    "Extreme learning machine": {"n_steps_in": 20, "output_shape": 'one_step', "other_columns_lenght": None, "constant": None, "n_hidden": 20, "alpha": 0.3, "rbf_width": 0, "activation_func": 'selu'},
    "Sklearn regression": {"regressor": 'linear', "n_steps_in": n_steps_in, "output_shape": output_shape, "other_columns_lenght": None, "constant": None, "alpha": 0.0001, "n_iter": 100, "epsilon": 1.35, "alphas": [0.1, 0.5, 1], "gcv_mode": 'auto', "solver": 'auto'},
}

predictions = predictit.main.predict()
```

Besides the plot and the results, a table of model errors is also printed. It can look like this.

How does the framework work? It's a kind of soft brute force. The result is a matrix of predictions evaluated with an error criterion. It's evaluated on multiple data lengths and repeated on shifted data to filter out accidental successes. The final n-dimensional matrix is analyzed, and the best models are selected together with the appropriate data lengths and data preprocessing.
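The idea can be illustrated with a toy version: score each candidate model on several shifted windows, average the error, and keep the best. This is a simplified sketch with made-up one-step models, not predictit's internal code:

```python
import numpy as np

def mean_pred(window):
    """Toy model: predict the mean of the window."""
    return float(np.mean(window))

def last_value_pred(window):
    """Toy model: naively repeat the last observed value."""
    return float(window[-1])

def evaluate(models, series, window=30, repeats=4):
    """RMSE of each model's one-step prediction over several shifted windows."""
    errors = {name: [] for name in models}
    for r in range(repeats):
        end = len(series) - r - 1
        train = series[end - window:end]
        actual = series[end]
        for name, model in models.items():
            errors[name].append((model(train) - actual) ** 2)
    # Averaging over repeats keeps one lucky window from deciding the ranking
    return {name: float(np.sqrt(np.mean(e))) for name, e in errors.items()}

rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=200))  # random-walk toy data
scores = evaluate({"mean": mean_pred, "last": last_value_pred}, series)
best = min(scores, key=scores.get)
```

The real framework adds more dimensions to this matrix (models, data lengths, preprocessing variants), but the ranking principle is the same.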

You can choose how to standardize the data, which error criterion to use, and various initial model arguments. You can use *config.optimize* to find the best arguments for given models if you set up limits for the arguments. It's based on dividing the search range into intervals, finding the best interval, and dividing again. It works not only with integers and floats, but also with lists of strings: it creates all the combinations and finds the best one.
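A hedged sketch of that interval-splitting idea for a single numeric hyperparameter (the loss function here is a stand-in for a model's error criterion; this is not predictit's optimizer):

```python
def optimize_interval(loss, low, high, splits=5, rounds=4):
    """Repeatedly sample the interval evenly, then zoom in around the best point."""
    best_x = low
    for _ in range(rounds):
        step = (high - low) / splits
        candidates = [low + i * step for i in range(splits + 1)]
        best_x = min(candidates, key=loss)
        # Narrow the interval to the neighbourhood of the best candidate
        low = max(low, best_x - step)
        high = min(high, best_x + step)
    return best_x

# Toy error criterion with a minimum at x = 0.3
loss = lambda x: (x - 0.3) ** 2
best = optimize_interval(loss, 0.0, 1.0)
```

Each round shrinks the search interval, so a few rounds already get close to the optimum without an exhaustive grid.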

If you use, for example, the Sklearn regression model, there is a *regressor* parameter. There is also a function for collecting all the regressors from Sklearn. If you then use optimization, it will find the regressor best suited to your data. No need to learn about the various algorithms like lasso or passive-aggressive regression, just use it.
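To make the idea concrete, here is a hedged sketch that tries a few Sklearn regressors on lagged features and keeps the one with the lowest error. The regressor list and the lag-feature helper are illustrative assumptions; predictit's own parsing of Sklearn regressors may differ:

```python
import numpy as np
from sklearn.linear_model import BayesianRidge, Lasso, LinearRegression

def make_lagged(series, n_steps_in=10):
    """Turn a 1D series into (lagged features, next value) pairs."""
    X = np.array([series[i:i + n_steps_in] for i in range(len(series) - n_steps_in)])
    y = series[n_steps_in:]
    return X, y

series = np.sin(np.linspace(0, 30, 300))  # toy data
X, y = make_lagged(series)
X_train, X_test = X[:-50], X[-50:]
y_train, y_test = y[:-50], y[-50:]

regressors = {
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=0.001),
    "bayesridge": BayesianRidge(),
}
errors = {}
for name, reg in regressors.items():
    reg.fit(X_train, y_train)
    errors[name] = float(np.sqrt(np.mean((reg.predict(X_test) - y_test) ** 2)))

best = min(errors, key=errors.get)  # name of the best-suited regressor
```
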

If you want to predict more columns, use the *predict_multiple* function. If you care about performance, define your own test data, run the *compare_models* function, choose only the few best models, and set *config.lengths = 0* and *config.repeatit = 1*. If you want to see all the results and all the errors, just set *config.debug = 1*.

If you like it and think it's useful, just fork it on GitHub. No donations allowed.