Pandas.DataFrame
Basic logic
A DataFrame is like a dict of columns: every column has the same length, and all columns share one index that labels the rows.
Let's look at a Series object first.
import numpy as np
import pandas as pd
s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
'''
a    0.4691
b   -0.2829
c   -1.5091
d   -1.1356
e    1.2121
dtype: float64
'''
s.index
# Index(['a', 'b', 'c', 'd', 'e'], dtype='object')
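The dict analogy can be checked directly on a Series. The sketch below is only an illustration: it uses fixed values instead of random ones so the results are predictable, and the variable names (value_b, has_c, missing, as_dict) are made up for this example.

```python
import numpy as np
import pandas as pd

# Fixed values (0.0 .. 4.0) instead of random ones, so results are predictable.
s = pd.Series(np.arange(5, dtype="float64"), index=["a", "b", "c", "d", "e"])

value_b = s["b"]       # label lookup, like dict[key]
has_c = "c" in s       # membership tests the index labels, not the values
missing = s.get("z")   # like dict.get: None when the label is absent
as_dict = s.to_dict()  # plain dict mapping label -> value
```

Note that `in` checks the labels, so `1.0 in s` would be False here even though 1.0 is one of the values.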
A simple DataFrame
dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
'''
                   A         B         C         D
2013-01-01  0.469112 -0.282863 -1.509059 -1.135632
2013-01-02  1.212112 -0.173215  0.119209 -1.044236
2013-01-03 -0.861849 -2.104569 -0.494929  1.071804
2013-01-04  0.721555 -0.706771 -1.039575  0.271860
2013-01-05 -0.424972  0.567020  0.276232 -1.087401
2013-01-06 -0.673690  0.113648 -1.478427  0.524988
'''
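To see the "dict of columns" idea on a DataFrame like this one, here is a minimal sketch. It assumes deterministic integer data in place of the random values, and the names col_a, keys, same_index and lengths are only for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20130101", periods=6)
# Deterministic data (0 .. 23) in place of random values.
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

col_a = df["A"]    # key lookup returns one column as a Series
keys = list(df)    # iterating a DataFrame yields its column labels
same_index = col_a.index.equals(df.index)  # the column carries the row index
lengths = {name: len(col) for name, col in df.items()}  # every column has 6 rows
```

Each column pulled out this way keeps the date index, which is what lets columns stay aligned row by row.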
DataFrame from dict:
df2 = pd.DataFrame({'A': 1.,
                    'B': pd.Timestamp('20130102'),
                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                    'D': np.array([3] * 4, dtype='int32'),
                    'E': pd.Categorical(["test", "train", "test", "train"]),
                    'F': 'foo'})
'''
   A          B  C  D      E    F
0  1 2013-01-02  1  3   test  foo
1  1 2013-01-02  1  3  train  foo
2  1 2013-01-02  1  3   test  foo
3  1 2013-01-02  1  3  train  foo
'''
df2.dtypes
'''
A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object
'''
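In the dict constructor above, scalar values such as 1. and 'foo' are repeated to fill every row, while list-like values must agree in length. The following is a small sketch of that behaviour with made-up column names (x, y, label); it also uses select_dtypes to filter columns by dtype.

```python
import pandas as pd

# The scalar "foo" is broadcast to match the 3-element list.
small = pd.DataFrame({"x": [10, 20, 30], "label": "foo"})

# List-like values of different lengths are rejected.
try:
    pd.DataFrame({"x": [1, 2], "y": [1, 2, 3]})
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# Keep only the numeric columns, chosen by dtype.
numeric_only = small.select_dtypes(include="number")
```

This is why "each item is of the same length" holds for a DataFrame: the constructor either broadcasts a scalar or refuses mismatched inputs.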
References
For more about:
- Viewing data
- Selecting elements by label/position
See