pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data easy and intuitive.

Limitations of NumPy

For performance, NumPy arrays are significantly faster than plain Python lists, but NumPy is missing features needed for data analysis on relational data (data that are related to one another). A few of the features missing in NumPy are:

  1. No way to attach labels to data.
  2. No pre-built methods to fill missing values.
  3. No way to group data.
  4. No way to pivot data.
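To make the four points above concrete, here is a minimal sketch of how pandas covers each of them. The sales table, its column names, and its values are invented for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records, invented for illustration only.
sales = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [10.0, np.nan, 7.0, 9.0],
})

# 1. Labels: a column is addressed by name rather than by position.
revenue = sales["revenue"]

# 2. Missing values: replace NaN in "revenue" with a chosen default.
filled = sales.fillna({"revenue": 0.0})

# 3. Grouping: total revenue per city.
per_city = filled.groupby("city")["revenue"].sum()

# 4. Pivoting: one row per city, one column per quarter.
wide = filled.pivot(index="city", columns="quarter", values="revenue")

print(per_city)
print(wide)
```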

Why pandas?

pandas is built on top of NumPy to make data processing on relational data easier.

Ingesting (the process of obtaining and importing data), storing, pre-processing, summarising, and visualizing data can all be done effectively in pandas.

Size mutability: columns can be inserted into and deleted from a DataFrame.
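A small sketch of what size mutability looks like in practice. The column names and values here are made up for the example.

```python
import pandas as pd

# Hypothetical starting table.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Add a column by assignment; it is appended as the last column.
df["total"] = df["a"] + df["b"]

# Insert a column at a specific position (here, position 0).
df.insert(0, "id", [101, 102, 103])

# Delete a column; drop returns a new DataFrame by default.
df = df.drop(columns=["b"])

print(df)
```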

pandas makes it easy to reshape, merge, and join data sets.
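As a rough illustration of merging, joining, and reshaping, the sketch below uses three small tables. All table names, columns, and values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only.
employees = pd.DataFrame({"emp_id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]})
salaries = pd.DataFrame({"emp_id": [1, 2, 4], "salary": [50, 60, 70]})
scores = pd.DataFrame({"student": ["Asha", "Ben"], "math": [90, 80], "physics": [85, 75]})

# Merge: SQL-style inner join on the shared key column.
merged = employees.merge(salaries, on="emp_id", how="inner")

# Join: aligns on the index and keeps every row of the left table by default.
joined = employees.set_index("emp_id").join(salaries.set_index("emp_id"))

# Reshape: turn the wide score table into a long one (one row per student and subject).
long_scores = scores.melt(id_vars="student", var_name="subject", value_name="score")

print(merged)
print(joined)
print(long_scores)
```

The inner merge keeps only employees that appear in both tables, while the index-based join keeps all employees and leaves the salary empty where there is no match.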

import pandas as pd

creating a Series (a Series is a 1D labeled, homogeneously typed array)
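A few common ways to build a Series, sketched with made-up values:

```python
import pandas as pd

# From a list: pandas assigns a default integer index 0, 1, 2.
s1 = pd.Series([10, 20, 30])

# From a list with explicit labels.
s2 = pd.Series([10, 20, 30], index=["a", "b", "c"])

# From a dict: the keys become the index labels.
s3 = pd.Series({"x": 1.5, "y": 2.5})

print(s2["b"])    # look up a value by its label
print(s3.dtype)   # every element shares one dtype
```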

indexing and selecting data (loc and iloc are the preferred accessors)
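The difference between the two accessors can be sketched as follows: loc selects by label, iloc selects by integer position. The table and its row labels are hypothetical.

```python
import pandas as pd

# Hypothetical table with string row labels.
df = pd.DataFrame(
    {"name": ["Asha", "Ben", "Chen"], "age": [31, 25, 42]},
    index=["r1", "r2", "r3"],
)

# loc: row label and column label.
by_label = df.loc["r2", "age"]

# iloc: row position and column position (both zero-based).
by_position = df.iloc[1, 1]

# Label slices include the end label; position slices exclude the end position.
label_slice = df.loc["r1":"r2"]
position_slice = df.iloc[0:2]

# loc also accepts a boolean mask to filter rows.
over_30 = df.loc[df["age"] > 30, "name"]

print(by_label, by_position)
print(over_30)
```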

reading the file
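A minimal sketch of reading a CSV file. To keep the example runnable without a file on disk, it reads from an in-memory string; with a real file you would pass its path to pd.read_csv instead. The column names and rows are invented for the example.

```python
import io

import pandas as pd

# Stand-in for the contents of a CSV file (hypothetical data).
csv_text = """name,age,city
Asha,31,Pune
Ben,25,Delhi
Chen,,Mumbai
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the table
print(df.shape)    # (number of rows, number of columns)
df.info()          # column dtypes and non-null counts
```

The empty age field in the last row is read as NaN.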

dealing with null (NaN) values
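Typical steps for handling missing values are counting them, dropping them, or filling them. The sketch below assumes a small made-up table; filling age with the column mean is just one possible choice.

```python
import numpy as np
import pandas as pd

# Hypothetical table with one missing value per column.
df = pd.DataFrame({"age": [31, np.nan, 42], "city": ["Pune", "Delhi", None]})

# Count missing values in each column.
missing_per_column = df.isna().sum()

# Drop every row that contains at least one missing value.
dropped = df.dropna()

# Fill each column with its own replacement value.
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})

print(missing_per_column)
print(dropped)
print(filled)
```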

operations on a DataFrame
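A few everyday DataFrame operations, sketched on a made-up order table: column arithmetic, aggregation, sorting, filtering, and summary statistics.

```python
import pandas as pd

# Hypothetical order table.
df = pd.DataFrame({"price": [100.0, 250.0, 80.0], "qty": [2, 1, 5]})

# Element-wise arithmetic between columns creates a new column.
df["amount"] = df["price"] * df["qty"]

# Aggregate a column to a single value.
total = df["amount"].sum()

# Sort rows by a column, largest first.
sorted_df = df.sort_values("amount", ascending=False)

# Keep only the rows that satisfy a condition.
expensive = df[df["price"] > 90]

# Summary statistics (count, mean, std, min, quartiles, max) per numeric column.
summary = df.describe()

print(sorted_df)
print(summary)
```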

Thank you for reading!!😀😀
