2. Importing data in a Colab notebook#
There are several ways to import time series data into Google Colab, depending on the format and location of the data. Here are some common methods:
2.1 Importing data from local storage#
You can upload time series data directly from your local machine to Google Colab. Either click the file icon in the left sidebar of the Colab interface, choose “Upload”, and select the file(s) containing your time series data, or trigger the same upload dialog from code with files.upload(), as shown below.
from google.colab import files
# Upload the file
uploaded = files.upload()
# Read the uploaded file
for filename in uploaded.keys():
    print('Uploaded file:', filename)
    with open(filename, 'r') as file:
        data = file.read()
    # Process the data as needed
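If the uploaded file is a CSV (for example a monthly series such as AirPassengers.csv), a minimal sketch for loading it straight into a pandas DataFrame from the bytes that files.upload() returns:
import io
import pandas as pd
# uploaded maps each file name to its raw bytes
for filename, content in uploaded.items():
    df = pd.read_csv(io.BytesIO(content))
    print(filename, df.shape)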
2.2 Importing data from Google Drive#
If your time series data is stored in Google Drive, you can mount your Google Drive in Colab to access the data. Use the following code snippet to mount Google Drive:
from google.colab import drive
drive.mount('/content/drive')
You will be prompted to authenticate and authorize access to your Drive. After successful authentication, you can access your Google Drive files and load your time series data.
from google.colab import drive
# Mount Google Drive
drive.mount('/content/drive')
# Read a file from Google Drive
file_path = '/content/drive/MyDrive/book writing/AirPassengers.csv'
with open(file_path, 'r') as file:
    data = file.read()
# Process the data as needed
print(data[:101])
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Month,#Passengers
1949-01,112
1949-02,118
1949-03,132
1949-04,129
1949-05,121
1949-06,135
1949-07,148
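In practice it is usually simpler to hand the Drive path straight to pandas. A sketch using the same AirPassengers.csv file, with the Month column parsed as a datetime index:
import pandas as pd
# Read the CSV from Drive and index it by the parsed Month column
file_path = '/content/drive/MyDrive/book writing/AirPassengers.csv'
df = pd.read_csv(file_path, parse_dates=['Month'], index_col='Month')
print(df.head())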
2.3 Importing data from a remote URL#
If your time series data is available at a direct download link, you can download it with the wget or curl command in Colab, or fetch it directly in Python.
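For instance, a minimal wget sketch (using the same airline-passengers dataset as the example below):
# Download the CSV into the Colab working directory
!wget -q https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv -O airline-passengers.csv
The same file can also be fetched in pure Python with the requests library: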
import requests
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
response = requests.get(url)
data = response.text
print(data[:96])
# Process the data as needed
"Month","Passengers"
"1949-01",112
"1949-02",118
"1949-03",132
"1949-04",129
"1949-05",121
2.4 Importing data from a GitHub repository#
If your dataset lives in a public GitHub repository, you can request the file’s raw.githubusercontent.com URL in the same way:
import requests
url = 'https://raw.githubusercontent.com/ejgao/Time-Series-Datasets/master/Electric_Production.csv'
response = requests.get(url)
data = response.text
print(data[:100])
# Process the data as needed
DATE,IPG2211A2N
1/1/1985,72.5052
2/1/1985,70.672
3/1/1985,62.4502
4/1/1985,57.4714
5/1/1985,55.3151
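To turn the downloaded text into a time-indexed DataFrame, you can parse the in-memory string. A sketch using the data variable and the DATE column shown above:
import io
import pandas as pd
# Wrap the downloaded string in a file-like object and parse DATE as the index
df = pd.read_csv(io.StringIO(data), parse_dates=['DATE'], index_col='DATE')
print(df.head())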
2.5 Google Cloud Storage (GCS)#
If your data is stored in Google Cloud Storage, you can use the google-cloud-storage library to access it. First, you’ll need to authenticate your Google Cloud account:
from google.colab import auth
auth.authenticate_user()
Then, you can use the gsutil command to copy data from GCS to your Colab environment.
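For example, a minimal sketch (the bucket and object names below are placeholders) that copies a CSV from GCS into the Colab working directory and loads it with pandas:
import pandas as pd
# Copy the object from GCS into the local Colab filesystem (placeholder bucket/object names)
!gsutil cp gs://your-bucket-name/AirPassengers.csv /content/
# Load the copied file as usual
df = pd.read_csv('/content/AirPassengers.csv')
print(df.head())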
These are some common methods to import time series data into Google Colab, depending on the data’s location and format. Choose the method that suits your specific use case and data source.
2.6 Exploring the sheets within an Excel file#
import pandas as pd
# filepath points to the Excel workbook containing the yearly order sheets
excelfile = pd.ExcelFile(filepath)
excelfile.sheet_names
['2017', '2018']
excelfile.parse('2017')[:5]
|   | Line_Item_ID | Date | Credit_Card_Number | Quantity | Menu_Item |
|---|---|---|---|---|---|
| 0 | 1 | 2017-01-01 | 7437926611570790 | 1 | spicy miso ramen |
| 1 | 2 | 2017-01-01 | 7437926611570790 | 1 | spicy miso ramen |
| 2 | 3 | 2017-01-01 | 8421920068932810 | 3 | tori paitan ramen |
| 3 | 4 | 2017-01-01 | 8421920068932810 | 3 | tori paitan ramen |
| 4 | 5 | 2017-01-01 | 4787310681569640 | 1 | truffle butter ramen |
excelfile.parse('2018')[:5]
|   | Line_Item_ID | Date | Credit_Card_Number | Quantity | Menu_Item |
|---|---|---|---|---|---|
| 0 | 36765 | 2018-01-01 | 4178504356986060 | 1 | burnt garlic tonkotsu ramen |
| 1 | 36766 | 2018-01-01 | 4178504356986060 | 1 | burnt garlic tonkotsu ramen |
| 2 | 36767 | 2018-01-01 | 9385429783634070 | 2 | vegetarian spicy miso |
| 3 | 36768 | 2018-01-01 | 9385429783634070 | 2 | vegetarian spicy miso |
| 4 | 36769 | 2018-01-01 | 2528867357453560 | 1 | tori paitan ramen |
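Since each year lives in its own sheet, a minimal sketch (assuming the same filepath and the column names shown above) for combining the two sheets into a single chronological DataFrame:
import pandas as pd
# sheet_name=None reads every sheet into a dict keyed by sheet name
sheets = pd.read_excel(filepath, sheet_name=None, parse_dates=['Date'])
# Stack the yearly sheets into one frame and sort by date
orders = pd.concat(sheets.values(), ignore_index=True).sort_values('Date')
print(orders['Date'].min(), orders['Date'].max())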