**pandas** is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, **real world** data analysis in Python. Additionally, it has the broader goal of becoming **the most powerful and flexible open source data analysis / manipulation tool available in any language**. It is already well on its way toward this goal.

## Different kinds of data

pandas is well suited for many different kinds of data:

- Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet
- Ordered and unordered (not necessarily fixed-frequency) time series data
- Arbitrary matrix data (homogeneously typed or heterogeneous) with row and column labels
- Any other form of observational / statistical data sets. The data actually need not be labeled at all to be placed into a pandas data structure
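
As a rough illustration of the kinds of data listed above, the following sketch builds a small heterogeneously typed table, an irregularly spaced time series, and an unlabeled series. All column names, dates, and values are invented for the example.

```python
import pandas as pd

# Tabular data with heterogeneously typed columns (text, integer, float).
# Column names and values are hypothetical.
table = pd.DataFrame(
    {"name": ["a", "b"], "count": [1, 2], "ratio": [0.5, 0.25]}
)

# Time series data that is not fixed-frequency: the gaps between
# timestamps differ (2 days, then 7 days).
ts = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-10"]),
)

# Unlabeled data: when no labels are given, pandas assigns a default
# integer index starting at 0.
unlabeled = pd.Series([7, 8, 9])

print(table.dtypes)
print(ts)
print(unlabeled)
```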

## Features of pandas

Here are just a few of the things that pandas does well:

- Easy handling of **missing data** (represented as NaN) in floating point as well as non-floating point data
- Size mutability: columns can be **inserted and deleted** from DataFrame and higher dimensional objects
- Automatic and explicit **data alignment**: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations
- Powerful, flexible **group by** functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data
- Make it **easy to convert** ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects
- Intelligent label-based **slicing**, **fancy indexing**, and **subsetting** of large data sets
- Intuitive **merging** and **joining** data sets
- Flexible **reshaping** and pivoting of data sets
- **Hierarchical** labeling of axes (possible to have multiple labels per tick)
- Robust IO tools for loading data from **flat files** (CSV and delimited), Excel files, databases, and saving / loading data from the ultrafast **HDF5 format**
- **Time series**-specific functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging, etc.
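
A few of the features above can be sketched in a handful of lines. This is a minimal example on made-up data (the team names, scores, and color codes are hypothetical), covering missing data, label alignment, group by, and merging.

```python
import numpy as np
import pandas as pd

# Missing data: NaN values are skipped by default in reductions
# such as mean(), and can be replaced with fillna().
s = pd.Series([1.0, np.nan, 3.0])
mean_ignoring_nan = s.mean()
filled = s.fillna(0.0)

# Automatic data alignment: values are matched by label, not by position.
# Labels present in only one operand produce NaN in the result.
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20], index=["y", "z"])
aligned_sum = a + b

# Group by: split rows by team, then sum the scores of each group.
df = pd.DataFrame({"team": ["red", "blue", "red"], "score": [1, 2, 3]})
per_team = df.groupby("team")["score"].sum()

# Merging: a left join that attaches a color code to every row of df.
colors = pd.DataFrame({"team": ["red", "blue"], "hex": ["#f00", "#00f"]})
merged = df.merge(colors, on="team", how="left")

print(aligned_sum)
print(per_team)
print(merged)
```

Note that `a + b` returns a floating point result here, because the unmatched label `x` has to be represented as NaN.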

## Getting Started

Let us install pandas and see how to use it in a program.

- Installation: `pip install pandas`
- Importing: `import pandas as pd`

- Once pandas is imported, we can create DataFrames
- DataFrames can be processed either with native pandas operations or through higher-level interfaces such as SQL (using pandasql)
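
The steps above might look like the following minimal sketch. The `orders` data is invented for illustration, and only native pandas operations are shown; the pandasql route is left out here.

```python
import pandas as pd

# Hypothetical sample data, created from a dict of column name -> values.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3, 4],
        "customer": ["alice", "bob", "alice", "carol"],
        "amount": [20.0, 35.5, 12.25, 50.0],
    }
)

# Native operations: filter rows with a boolean mask,
# then aggregate the order amounts per customer.
large_orders = orders[orders["amount"] > 15]
total_per_customer = orders.groupby("customer")["amount"].sum()

print(large_orders)
print(total_per_customer)
```

The filter and the aggregation correspond roughly to a SQL `WHERE` clause and a `GROUP BY` with `SUM`, which is why a SQL interface on top of DataFrames is a natural fit.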
