Live Project: Dogecoin Price Prediction with ML

Project Info:

This project predicts the future closing price of Dogecoin (DOGE) from historical cryptocurrency market data using time series forecasting. The goal is to analyze price trends & correlations, then build a SARIMAX (Seasonal ARIMA with exogenous variables) model for short-term price prediction. The project shows how external market indicators can influence closing price predictions in the crypto domain.

It showcases real-world applications of time series modeling, feature engineering & visualization for financial forecasting.

Project Implementation:

Imported libraries like Pandas, NumPy, Matplotlib, Seaborn & SARIMAX from statsmodels
Loaded DOGE historical price data from a CSV file
Converted the Date column into datetime format & set it as the index
Cleaned the dataset by removing null values
Performed correlation analysis to identify key influencing factors
Engineered new features such as price gap, high/low ratio & volume-based metrics
Selected relevant features based on correlation with the closing price
Visualized the closing price trend over time
Took the last 30 days of data & split them into training (first 11 days) & testing (last 19 days) sets
Built a SARIMAX model with Close as the dependent variable & other engineered features as exogenous variables
Generated predictions & visualized them against the actual closing prices

Key Learnings & Outcomes:

Learned how to prepare time series data for forecasting
Understood correlation-driven feature selection in financial datasets
Gained hands-on experience in SARIMAX model building & interpretation
Visualized predictions vs. actuals for evaluation
Explored real-world application of time series forecasting in cryptocurrency analysis

Importing Libraries

The analysis uses the following libraries:

Pandas: loads the data into a 2D DataFrame and provides many functions for carrying out analysis tasks in one go.
NumPy: provides fast arrays that can perform large computations in a very short time.
Matplotlib / Seaborn: these libraries are used to draw visualizations.


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Now let us load the dataset into a pandas DataFrame. One can download the CSV file from here.


data = pd.read_csv("DOGE-USD.csv")
data.head()

Now, let’s check the correlation


data.corr(numeric_only=True)

# This code is modified by Susobhan Akhuli

Convert the date strings into a proper datetime format with pandas and set the Date column as the index. After that, check whether any null values are present.


# infer_datetime_format is deprecated in pandas 2.x; the format is inferred by default
data['Date'] = pd.to_datetime(data['Date'])
data.set_index('Date', inplace=True)

data.isnull().any()
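
As an alternative sketch, the date conversion and indexing can be done while the file is read. The three-row CSV text below is a made-up stand-in for DOGE-USD.csv and the name sample is hypothetical; parse_dates and index_col are standard read_csv parameters.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for DOGE-USD.csv (values are made up).
csv_text = """Date,Open,High,Low,Close,Volume
2021-01-03,0.0105,0.0139,0.0094,0.0098,1200000000
2021-01-01,0.0047,0.0057,0.0046,0.0057,230000000
2021-01-02,0.0057,0.0130,0.0055,0.0106,3400000000
"""

# parse_dates converts the column to datetime; index_col makes it the index.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# Time series models expect chronological order, so sort by the date index.
sample = sample.sort_index()

print(type(sample.index).__name__)           # DatetimeIndex
print(sample.index.is_monotonic_increasing)  # True
```

Sorting the index is a cheap safeguard: a forecast fitted on rows that are out of order would be learning from a scrambled history.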

Drop the rows with missing values so that they do not cause errors during the analysis.


data = data.dropna()
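
Before dropping rows, it can help to see how many values are missing in each column and how many rows the drop will cost. The sketch below assumes a small made-up frame (frame is a hypothetical name, not the DOGE data).

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two gaps, standing in for the DOGE data.
frame = pd.DataFrame({
    "Close": [0.05, np.nan, 0.07, 0.06],
    "Volume": [1.0e9, 2.0e9, np.nan, 1.5e9],
})

# Count the NaNs in each column (converted to plain ints for printing).
missing_per_column = {col: int(n) for col, n in frame.isnull().sum().items()}

# Keep only the rows where every column has a value.
cleaned = frame.dropna()

print(missing_per_column)              # {'Close': 1, 'Volume': 1}
print(len(frame), "->", len(cleaned))  # 4 -> 2
```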

Now, review the summary statistics of the data using the describe() method.


data.describe()

First, we will analyze the closing price, since it is the value we want to predict.


plt.figure(figsize=(20, 7))
x = data.groupby('Date')['Close'].mean()
x.plot(linewidth=2.5, color='b')
plt.xlabel('Date')
plt.ylabel('Close')
plt.title("Date vs Close of 2021")

The ‘Close’ column is the target we want to predict. We derive several new features from the existing columns and give them short names. We then rank every column by the absolute value of its correlation with ‘Close’, in descending order.


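# Daily price range weighted by traded volume
data["gap"] = (data["High"] - data["Low"]) * data["Volume"]
# High and low prices relative to volume
data["y"] = data["High"] / data["Volume"]
data["z"] = data["Low"] / data["Volume"]
# High/low ratio, plain and weighted by volume
data["a"] = data["High"] / data["Low"]
data["b"] = (data["High"] / data["Low"]) * data["Volume"]
# Take the absolute value before sorting so the ranking reflects correlation strength
data.corr(numeric_only=True)["Close"].abs().sort_values(ascending=False)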
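
The ranking step can be tried on synthetic data first. The sketch below is an illustration under stated assumptions: the prices and volumes are randomly generated (not real DOGE values) and the names toy and ranking are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic OHLCV-like data (an assumption for illustration only).
close = 0.05 * np.exp(0.02 * rng.standard_normal(n).cumsum())
spread = rng.uniform(0.01, 0.05, n)
toy = pd.DataFrame({
    "Close": close,
    "High": close * (1 + spread),
    "Low": close * (1 - spread),
    "Volume": rng.lognormal(mean=20, sigma=0.5, size=n),
})

# Same engineered features as in the walkthrough.
toy["gap"] = (toy["High"] - toy["Low"]) * toy["Volume"]
toy["a"] = toy["High"] / toy["Low"]
toy["b"] = toy["a"] * toy["Volume"]

# Drop the target itself, then rank by absolute correlation, strongest first.
ranking = toy.corr()["Close"].drop("Close").abs().sort_values(ascending=False)
print(ranking)
```

Calling abs() before sort_values() matters: a feature with a correlation of -0.9 is more informative than one with +0.2, and sorting the signed values would place it last.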

After inspecting the correlations, we keep a few of the features. High, Low, and Open are excluded because they are already highly correlated with Close from the start.


data = data[["Close", "Volume", "gap", "a", "b"]]
data.head()

We now introduce the ARIMA model for time series analysis. ARIMA stands for AutoRegressive Integrated Moving Average and is specified by three order parameters (p, d, q): p is the number of autoregressive (AR) lags, d is the degree of differencing (the "integrated" part, I), and q is the number of moving-average (MA) lags. SARIMAX is Seasonal ARIMA with exogenous variables.
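
Written out, a non-seasonal ARIMA(p, d, q) model with exogenous regressors can be expressed as a regression with ARIMA errors, which is how the SARIMAX class in statsmodels treats exog by default. The symbols below are notation chosen for this sketch: y_t is the closing price, x_t the vector of exogenous features, L the lag operator, and ε_t white noise.

```latex
\begin{aligned}
y_t &= \beta^{\top} x_t + u_t, \\
\phi(L)\,(1 - L)^{d}\,u_t &= \theta(L)\,\varepsilon_t, \\
\phi(L) &= 1 - \phi_1 L - \dots - \phi_p L^{p}, \\
\theta(L) &= 1 + \theta_1 L + \dots + \theta_q L^{q}.
\end{aligned}
```

For order (2, 1, 1), this becomes (1 - φ_1 L - φ_2 L^2)(1 - L) u_t = (1 + θ_1 L) ε_t: two AR lags, one round of differencing, and one MA lag.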


# Work with the 30 most recent rows only
df2 = data.tail(30)
# First 11 rows for training, remaining 19 for testing
train = df2[:11]
test = df2[-19:]

print(train.shape, test.shape)
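
A time series split must keep the rows in chronological order, with the test set strictly after the training set. A minimal sketch of such a split follows; chronological_split and window are hypothetical names, and the prices are made up.

```python
import numpy as np
import pandas as pd


def chronological_split(frame, n_test):
    """Return (train, test) where test is the last n_test rows, order preserved."""
    if not 0 < n_test < len(frame):
        raise ValueError("n_test must be between 1 and len(frame) - 1")
    return frame.iloc[:-n_test], frame.iloc[-n_test:]


# Hypothetical 30-day window of closing prices.
dates = pd.date_range("2021-01-01", periods=30, freq="D")
window = pd.DataFrame({"Close": np.linspace(0.05, 0.08, 30)}, index=dates)

train_part, test_part = chronological_split(window, n_test=19)
print(train_part.shape, test_part.shape)  # (11, 1) (19, 1)
```

Unlike a random split, this never lets the model see observations from dates later than those it is asked to predict.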

Model Development


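from statsmodels.tsa.statespace.sarimax import SARIMAX
# Close is the target; the remaining columns act as exogenous regressors.
# No seasonal_order is passed, so the model has no seasonal terms (the default).
model = SARIMAX(endog=train["Close"], exog=train.drop(
    "Close", axis=1), order=(2, 1, 1))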
results = model.fit()
print(results.summary())


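# Forecast the positions right after the training rows (11 to 29 here)
start = len(train)
end = len(train) + len(test) - 1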
predictions = results.predict(
    start=start,
    end=end,
    exog=test.drop("Close", axis=1))
predictions

Finally, plot the prediction to get a visualization.


test["Close"].plot(legend=True, figsize=(12, 6))
predictions.plot(label='Predicted Close', legend=True)
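
A plot gives a visual impression, and a few error metrics make the comparison concrete. The sketch below computes MAE, RMSE, and MAPE with NumPy; forecast_errors is a hypothetical helper and the three price pairs are made up.

```python
import numpy as np


def forecast_errors(actual, predicted):
    """Return MAE, RMSE and MAPE (in percent) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    residual = actual - predicted
    return {
        "mae": float(np.mean(np.abs(residual))),
        "rmse": float(np.sqrt(np.mean(residual ** 2))),
        # MAPE divides by the actual values, so they must be non-zero.
        "mape_pct": float(np.mean(np.abs(residual / actual)) * 100),
    }


# Made-up actual vs. predicted closing prices.
scores = forecast_errors([0.20, 0.25, 0.40], [0.22, 0.24, 0.36])
print(scores)
```

With the walkthrough's variables, the call would be forecast_errors(test["Close"], predictions), assuming both have the same length.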

Notebook link : click here.

Dataset Link: click here

Address

Prime Point AI, Data Science Course, Data Analytics Training

Office No. 7, First Floor, Quantum Works Awfis Building, Near Nal Stop, Metro Station, Erandwane, Pune, Maharashtra - 411004

Contact Details

+91 8446273688

info@primepointinstitute.com
