Displaying Dataset Description

tags: #python/data_science/eda

Displaying the Dataset Description

To retrieve the summary description of a DataFrame object, we can use the following method:

df.info()

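This method gives a quick overview of a DataFrame: the index range, each column's data type and non-null count, and the memory usage. It prints the summary rather than returning it. The number of missing values in a column is the total number of entries minus that column's non-null count.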

Sample Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 7 columns):
column_1      10000 non-null int64
column_2      10000 non-null object
column_3      10000 non-null float64
column_4      10000 non-null int64
column_5      10000 non-null object
column_6      10000 non-null float64
column_7      10000 non-null int64
dtypes: float64(2), int64(3), object(2)
memory usage: 547.0+ KB
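
As a minimal sketch of the above, the snippet below builds a small toy DataFrame (the data and column values are made up for illustration), captures the text that df.info() prints, and counts missing values directly with df.isna().sum(). The buf parameter of info() is used here only to show that the method writes text and returns None.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical toy data; the values are illustrative only
df = pd.DataFrame({
    "column_1": [1, 2, 3, 4],
    "column_2": ["a", None, "c", "d"],
    "column_3": [0.5, np.nan, np.nan, 2.0],
})

# info() writes its summary to stdout (or to buf) and returns None
buffer = io.StringIO()
result = df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)

# Missing values per column, counted directly instead of
# subtracting the non-null counts from the number of entries
missing = df.isna().sum()
print(missing)
```

If only the missing-value counts are needed, df.isna().sum() alone is usually enough; info() is more convenient for a first look at dtypes and memory usage.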

Getting the Dimension of the Dataset

To get the dimensions (shape) of the dataset, use the shape attribute:

df.shape

This returns a tuple with the number of records (rows) first, followed by the number of features (columns):

([# of records], [# of features])
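
A short sketch of how the tuple can be unpacked, assuming a small made-up DataFrame with 5 records and 2 features. Note that shape is an attribute, not a method, and that pandas reports rows before columns.

```python
import pandas as pd

# Hypothetical toy data: 5 records (rows), 2 features (columns)
df = pd.DataFrame({
    "feature_a": [1, 2, 3, 4, 5],
    "feature_b": [0.1, 0.2, 0.3, 0.4, 0.5],
})

# shape is an attribute (no parentheses) holding (rows, columns)
n_records, n_features = df.shape
print(f"{n_records} records, {n_features} features")
```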