Python How To Create Empty Dataframe
In this article, I will explain how to create an empty DataFrame in pandas, with or without column names and indices. Below is one of the many scenarios where you would need to create an empty DataFrame.
While working with files, we sometimes do not receive a file for processing, yet we still need to create a DataFrame manually with the column names we expect. If we don't create it with the same column names, later operations and transformations on the DataFrame (like unions) fail because they refer to columns that may not be present.
To handle situations like these, we always need to create a DataFrame with the same schema, meaning the same column names and data types, regardless of whether the file exists or is empty.
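The scenario above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the schema dtypes, the function name load_or_empty, and the file path are assumptions made up for the example.

```python
import os
import pandas as pd

# Hypothetical schema: column names from this article, dtypes chosen for illustration.
EXPECTED_SCHEMA = {
    "Courses": "object",
    "Fee": "int64",
    "Duration": "object",
    "Discount": "float64",
}

def load_or_empty(path):
    """Return the CSV at `path`, or a zero-row DataFrame with the expected schema."""
    if os.path.isfile(path) and os.path.getsize(path) > 0:
        return pd.read_csv(path, dtype=EXPECTED_SCHEMA)
    # No usable file: build zero rows with the expected columns and dtypes.
    return pd.DataFrame(
        {name: pd.Series(dtype=dtype) for name, dtype in EXPECTED_SCHEMA.items()}
    )

df = load_or_empty("courses_missing.csv")  # hypothetical path, assumed not to exist
print(df.empty)
print(df.dtypes)
```

Downstream code can then refer to the same columns whether or not a file arrived.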
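Note: A DataFrame whose rows contain only NaN values is not considered empty. DataFrame.empty returns True only when at least one axis has length zero, so a DataFrame of shape (0, n) is empty, and so is one of shape (n, 0), which has index labels but no columns and therefore no values.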
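A short check of how pandas reports emptiness for different shapes is sketched below. The variable names are made up for the example; in pandas, DataFrame.empty is True whenever any axis has length zero.

```python
import numpy as np
import pandas as pd

no_rows = pd.DataFrame(columns=["Courses", "Fee"])               # shape (0, 2)
no_columns = pd.DataFrame(index=["r1", "r2"])                    # shape (2, 0)
nan_rows = pd.DataFrame({"Courses": [np.nan], "Fee": [np.nan]})  # shape (1, 2)

# Print the shape and the empty flag for each case.
for name, frame in [("no_rows", no_rows),
                    ("no_columns", no_columns),
                    ("nan_rows", nan_rows)]:
    print(name, frame.shape, frame.empty)
```

Only the frame holding a NaN row reports empty as False, because it has both a row and columns.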
1. Quick Examples of Creating Empty DataFrame in pandas
If you are in a hurry, below are some quick examples of how to create an empty DataFrame in pandas.
    # Quick examples
    import pandas as pd

    # Create an empty DataFrame using the constructor
    df = pd.DataFrame()

    # Create an empty DataFrame with column names
    df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])

    # Create a DataFrame with index and columns
    # Note: this is not considered an empty DataFrame
    df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"], index=["index1"])

    # Add a row to a DataFrame
    # (DataFrame.append() was removed in pandas 2.0, so pd.concat() is used instead)
    row = {"Courses": "Spark", "Fee": 20000, "Duration": "30days", "Discount": 1000}
    df2 = pd.concat([df, pd.DataFrame([row])], ignore_index=True)

    # Check if the DataFrame is empty
    print("Empty DataFrame : " + str(df.empty))

To understand these in detail, read on.
2. Create Empty DataFrame Using Constructor
One simple way to create an empty pandas DataFrame is by using its constructor. The below example creates a DataFrame with zero rows and columns (empty).
    # Create an empty DataFrame using the constructor
    df = pd.DataFrame()
    print(df)
    print("Empty DataFrame : " + str(df.empty))

Yields below output. Notice that the columns and the index have no values.

    Empty DataFrame
    Columns: []
    Index: []
    Empty DataFrame : True
3. Creating Empty DataFrame with Column Names
Column labels can also be added while creating an empty DataFrame. In this case, the DataFrame contains only columns, with no rows or index values. To do this, use the DataFrame constructor with the columns param, which accepts a list of column labels.
    # Create an empty DataFrame with column names
    df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])
    print(df)
    print("Empty DataFrame : " + str(df.empty))

Yields below output.

    Empty DataFrame
    Columns: [Courses, Fee, Duration, Discount]
    Index: []
    Empty DataFrame : True

All columns in the above DataFrame have dtype object; you can change this by assigning a custom data type to each column.
    # Create an empty DataFrame with specific column types
    df = pd.DataFrame({'Courses': pd.Series(dtype='str'),
                       'Fee': pd.Series(dtype='int'),
                       'Duration': pd.Series(dtype='str'),
                       'Discount': pd.Series(dtype='float')})
    print(df.dtypes)

Yields below output. The exact dtype names can differ by platform and pandas version; for example, Fee is often reported as int64.

    Courses      object
    Fee           int32
    Duration     object
    Discount    float64
    dtype: object

4. Add Columns and Index While Creating DataFrame
Let's see how to create a DataFrame with columns and a row of NaN values. Note that this is not considered an empty DataFrame because it has a row, even though that row holds only NaN; you can check this with the df.empty attribute, which returns False. Use DataFrame.dropna() to drop rows that contain NaN values. To add an index (rows), use the index param along with the columns param for column labels.
    # Add columns and index while creating the DataFrame
    df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"], index=['index1'])
    print(df)
    print("Empty DataFrame : " + str(df.empty))

Yields below output. Note that this is not an empty DataFrame, as it has a row with NaN values.

            Courses  Fee  Duration  Discount
    index1      NaN  NaN       NaN       NaN
    Empty DataFrame : False

5. Check if DataFrame is Empty
The DataFrame.empty property is used to check whether a DataFrame is empty. It returns True when the DataFrame is empty and False otherwise. A DataFrame is considered non-empty if it contains one or more rows and at least one column. A DataFrame whose rows all hold NaN values is still considered non-empty.
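If rows that hold only NaN should count as "no data" for your purposes, one possible approach is to drop them before checking. The helper name has_no_data below is hypothetical, and this is only a sketch of that idea.

```python
import pandas as pd

def has_no_data(frame):
    """Return True when no rows remain after dropping rows that are entirely NaN."""
    return frame.dropna(how="all").empty

# One index label and no values, so the single row is all NaN.
placeholder = pd.DataFrame(columns=["Courses", "Fee"], index=["index1"])

print(placeholder.empty)         # plain check: the NaN row still counts as a row
print(has_no_data(placeholder))  # check after dropping all-NaN rows
```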
    if df.empty:
        print("Empty DataFrame")
    else:
        print("Non Empty DataFrame")

6. Create Empty DataFrame From Another DataFrame
You can also create a zero-record DataFrame from another existing DataFrame. This gives you a blank DataFrame with the same columns as the existing one but without any rows.
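Passing only the column labels to the constructor keeps the names but not the data types. If the dtypes should carry over as well, slicing zero rows from the existing DataFrame is one option. The sketch below compares both; the sample data and variable names are made up for illustration.

```python
import pandas as pd

source = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Discount": [1000.0, 2300.0],
})

# Labels only: the new columns do not inherit the source dtypes.
columns_only = pd.DataFrame(columns=source.columns)

# Zero-row slice: labels and dtypes are both kept.
same_schema = source.iloc[:0].copy()

print(columns_only.dtypes)
print(same_schema.dtypes)
```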
    # Create an empty DataFrame from another DataFrame
    columns_list = df.columns
    df2 = pd.DataFrame(columns=columns_list)
    print(df2)

Yields below output.

    Empty DataFrame
    Columns: [Courses, Fee, Duration, Discount]
    Index: []

7. Add Rows to Empty DataFrame
The DataFrame.append() method was used to add rows to an empty DataFrame, but it was deprecated in pandas 1.4 and removed in pandas 2.0; use pd.concat() instead. Adding rows this way copies the data on every call, so use it only when you want to add a few rows. To add hundreds or thousands of rows to a DataFrame, collect the data in a list and pass it to the constructor.
    # Add rows to an empty DataFrame
    df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])
    row = {"Courses": "Spark", "Fee": 20000, "Duration": '30days', "Discount": 1000}
    df2 = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
    print(df2)

Yields below output.

      Courses    Fee Duration Discount
    0   Spark  20000   30days     1000

To add more rows, use the constructor.
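    # Collect rows into a list.
    # get_data() is a placeholder for your own data source; it is expected
    # to yield (Courses, Fee, Duration, Discount) tuples.
    data = []
    db_data = get_data()
    for Courses, Fee, Duration, Discount in db_data:
        data.append([Courses, Fee, Duration, Discount])

    # Fill the DataFrame with the collected rows in one step.
    df = pd.DataFrame(data, columns=["Courses", "Fee", "Duration", "Discount"])

8. Add Rows From Another DataFrame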
If you have an empty DataFrame and want to fill it with data from one or more other DataFrames, you can do this as below.
    # Create a new empty DataFrame, then add the rows of df2 and df3
    # (df2 and df3 are existing DataFrames)
    df = pd.DataFrame()
    df = pd.concat([df, df2, df3], ignore_index=True)

9. Complete Example of Create Empty DataFrame in pandas
    import pandas as pd

    technologies = {
        'Courses': ["Spark", "PySpark", "Python", "pandas"],
        'Fee': [20000, 25000, 22000, 30000],
        'Duration': ['30days', '40days', '35days', '50days'],
        'Discount': [1000, 2300, 1200, 2000]
    }
    index_labels = ['r1', 'r2', 'r3', 'r4']
    df = pd.DataFrame(technologies, index=index_labels)
    print(df)

    # Create an empty DataFrame using the constructor
    df2 = pd.DataFrame()
    print(df2)

    # Add column names/labels to an empty DataFrame
    df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])
    print(df)

    # Add columns and index while creating the DataFrame
    index_labels = ['index1']
    df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"], index=index_labels)
    print(df)

    # Create an empty DataFrame from another DataFrame
    columns_list = df.columns
    df2 = pd.DataFrame(columns=columns_list)
    print(df2)

    # Add rows to an empty DataFrame
    # (pd.concat() replaces DataFrame.append(), which was removed in pandas 2.0)
    df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])
    row = {"Courses": "Spark", "Fee": 20000, "Duration": '30days', "Discount": 1000}
    df2 = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
    print(df2)

Conclusion
In this article, you have learned how to create a DataFrame with zero rows, with or without columns, how to check whether a DataFrame is empty, and how to add rows to it, with examples.
Happy Learning !!
Source: https://sparkbyexamples.com/pandas/pandas-create-empty-dataframe/
Posted by: mcculloughglelavold.blogspot.com
