
Tabular Data: Data Wrangling with Pandas

btd
3 min read · Nov 22, 2023



In data science, a DataFrame is a two-dimensional, tabular data structure available in languages such as Python (through the Pandas library), R, and Julia. Like a spreadsheet or SQL table, it organizes data in rows and columns, which makes it well suited to handling and manipulating structured data.

Let’s focus on DataFrames in Python with the Pandas library, which is widely used in the data science community.

Pandas DataFrame:

1. Creating a DataFrame:

You can create a DataFrame using various methods, such as from a dictionary, a list of lists, a NumPy array, or by reading data from an external source (CSV, Excel, SQL, etc.).

import pandas as pd

# From a dictionary: keys become column names, values become column data
data = {'Name': ['John', 'Jane', 'Bob'],
        'Age': [28, 24, 22],
        'City': ['New York', 'San Francisco', 'Los Angeles']}

df = pd.DataFrame(data)
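
The other creation routes mentioned above can be sketched in the same way. This is a minimal illustration rather than a complete tour: the variable names, the placeholder array values, and the in-memory CSV text are assumptions made for the example, and a real project would usually pass a file path to pd.read_csv instead.

```python
import io

import numpy as np
import pandas as pd

columns = ['Name', 'Age', 'City']

# From a list of lists: each inner list is one row.
rows = [['John', 28, 'New York'],
        ['Jane', 24, 'San Francisco'],
        ['Bob', 22, 'Los Angeles']]
df_from_rows = pd.DataFrame(rows, columns=columns)

# From a NumPy array: a 2-D array plus column names.
# The values 0..5 and the names 'A' and 'B' are placeholders.
array = np.arange(6).reshape(3, 2)
df_from_array = pd.DataFrame(array, columns=['A', 'B'])

# From CSV: read_csv accepts a path or any file-like object.
# An in-memory string stands in for a file on disk here.
csv_text = (
    "Name,Age,City\n"
    "John,28,New York\n"
    "Jane,24,San Francisco\n"
    "Bob,22,Los Angeles\n"
)
df_from_csv = pd.read_csv(io.StringIO(csv_text))

print(df_from_rows)
print(df_from_array)
print(df_from_csv)
```

All three calls produce a DataFrame with three rows. Only the source of the data differs.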

2. Viewing Data:

  • head(): Displays the first five rows of the DataFrame by default; pass a number to change how many.
  • tail(): Displays the last five rows of the DataFrame by default; pass a number to change how many.

print(df.head())
print(df.tail())
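
Because the example DataFrame has only three rows, head() and tail() with no argument both return the whole table. A small sketch of passing an explicit row count, rebuilding the same DataFrame so the snippet runs on its own (the names first_two and last_one are chosen for this example):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jane', 'Bob'],
                   'Age': [28, 24, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})

# Ask for a specific number of rows instead of the default.
first_two = df.head(2)  # rows for John and Jane
last_one = df.tail(1)   # row for Bob

print(first_two)
print(last_one)
```

Both methods return a new DataFrame and leave df unchanged, so they are safe to use for quick inspection.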

3. Accessing and Manipulating Data:
