
TIL Day ①-2


Pandas - Collecting financial data with the FinanceDataReader library

 

 

1) Installing and importing the libraries

 

#1. Install the package. If the line below is commented out, uncomment it by deleting the # at the very start of the line.
!pip install -U finance-datareader
---------------------------------------------------------------------------------------
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting finance-datareader
  Downloading finance_datareader-0.9.50-py3-none-any.whl (19 kB)
Requirement already satisfied: lxml in /usr/local/lib/python3.8/dist-packages (from finance-datareader) (4.9.2)
Requirement already satisfied: requests>=2.3.0 in /usr/local/lib/python3.8/dist-packages (from finance-datareader) (2.25.1)
Requirement already satisfied: pandas>=0.19.2 in /usr/local/lib/python3.8/dist-packages (from finance-datareader) (1.3.5)
Requirement already satisfied: tqdm in /usr/local/lib/python3.8/dist-packages (from finance-datareader) (4.64.1)
Collecting requests-file
  Downloading requests_file-1.5.1-py2.py3-none-any.whl (3.7 kB)
Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.8/dist-packages (from pandas>=0.19.2->finance-datareader) (1.21.6)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.8/dist-packages (from pandas>=0.19.2->finance-datareader) (2022.7)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.8/dist-packages (from pandas>=0.19.2->finance-datareader) (2.8.2)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests>=2.3.0->finance-datareader) (1.24.3)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests>=2.3.0->finance-datareader) (2022.12.7)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests>=2.3.0->finance-datareader) (2.10)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests>=2.3.0->finance-datareader) (4.0.0)
Requirement already satisfied: six in /usr/local/lib/python3.8/dist-packages (from requests-file->finance-datareader) (1.15.0)
Installing collected packages: requests-file, finance-datareader
Successfully installed finance-datareader-0.9.50 requests-file-1.5.1

 

#2. Import the library for data analysis.

import pandas as pd
 
 
#3. Import FinanceDataReader under the alias fdr.
#4. To check the library version, use .__version__.
 
import FinanceDataReader as fdr

fdr.__version__
-------------------------------
'0.9.50'
 

 

2) Fetching all stocks listed on the Korea Exchange (KRX)

# Use ? to view the help for an object and ?? to view its source code.

# In Jupyter Notebook, pressing shift + tab inside the parentheses of a function or method shows its help.
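
The ? and ?? shortcuts only work inside IPython or Jupyter. As a rough sketch of the plain-Python equivalents, the standard `inspect` module can return the docstring and the source code. `textwrap.dedent` is used here purely as a stand-in target that is available everywhere; it is not part of the original notebook.

```python
import inspect
import textwrap

# Roughly what `textwrap.dedent?` shows in Jupyter: the docstring.
doc_text = inspect.getdoc(textwrap.dedent)

# Roughly what `textwrap.dedent??` shows in Jupyter: the source code.
source_text = inspect.getsource(textwrap.dedent)

print(doc_text)
print(source_text[:200])  # print only the first 200 characters of the source
```

The same two calls should work on pure-Python functions such as `fdr.StockListing`, and the built-in `help()` prints the docstring in a formatted form.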
# KRX : all KRX stocks
# KOSPI : KOSPI stocks
# KOSDAQ : KOSDAQ stocks
# KONEX : KONEX stocks
# NASDAQ : NASDAQ stocks
# NYSE : New York Stock Exchange stocks
# SP500 : S&P 500 stocks
 
url = 'http://kind.krx.co.kr/corpgeneral/corpList.do?method=download&searchType=13'
df_listing = pd.read_html(url, header=0, flavor='bs4', encoding='EUC-KR')[0]
cols_ren = {'회사명':'Name', '종목코드':'Symbol', '업종':'Sector', '주요제품':'Industry', 
                    '상장일':'ListingDate', '결산월':'SettleMonth',  '대표자명':'Representative', 
                    '홈페이지':'HomePage', '지역':'Region', }
df_listing = df_listing.rename(columns = cols_ren)
df_listing
-----------------------------------------------------------------------------------------------
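
One thing to watch for when reading a listing table this way: KRX stock codes are six digits long, and if the parser reads the code column as integers, the leading zeros are dropped. The sketch below assumes that is what happened and pads the codes back to six characters. The toy table and its values are made up for illustration; with the real data the same line would be applied to `df_listing['Symbol']`.

```python
import pandas as pd

# Toy stand-in for the downloaded table (hypothetical names and codes).
toy_listing = pd.DataFrame({"Name": ["AAA", "BBB"], "Symbol": [123, 4560]})

# Convert to string and left-pad with zeros to the 6-digit code width.
toy_listing["Symbol"] = toy_listing["Symbol"].astype(str).str.zfill(6)

print(toy_listing)
```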

 

# Fetch all stocks listed on the Korea Exchange
 
df = fdr.StockListing('KRX')
df
----------------------------

 

# Check the number of rows and columns, in (rows, columns) order.
 
df.shape
-----------------
(2690, 17)
 
 
# View summary information for the whole DataFrame.
df.info()
-------------------------------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2690 entries, 0 to 2689
Data columns (total 17 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   Code         2690 non-null   object 
 1   ISU_CD       2690 non-null   object 
 2   Name         2690 non-null   object 
 3   Market       2690 non-null   object 
 4   Dept         2690 non-null   object 
 5   Close        2690 non-null   object 
 6   ChangeCode   2690 non-null   object 
 7   Changes      2690 non-null   int64  
 8   ChagesRatio  2690 non-null   float64
 9   Open         2690 non-null   int64  
 10  High         2690 non-null   int64  
 11  Low          2690 non-null   int64  
 12  Volume       2690 non-null   int64  
 13  Amount       2690 non-null   int64  
 14  Marcap       2690 non-null   int64  
 15  Stocks       2690 non-null   int64  
 16  MarketId     2690 non-null   object 
dtypes: float64(1), int64(8), object(8)
memory usage: 357.4+ KB
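
In the summary above, Close is listed as object rather than a numeric dtype, so it will not show up in numeric statistics. Why the real column comes back as object is not clear from the output, so the following is only a sketch on made-up values: `pd.to_numeric` with `errors="coerce"` converts what it can and turns anything unparseable into NaN.

```python
import pandas as pd

# Toy frame with Close stored as strings (hypothetical values).
toy = pd.DataFrame({"Code": ["000001", "000002", "000003"],
                    "Close": ["1000", "25300", "-"]})

# Convert to numbers; entries that cannot be parsed become NaN.
close_num = pd.to_numeric(toy["Close"], errors="coerce")

print(close_num.dtype)
print(close_num.isna().sum())  # number of values that failed to convert
```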
 
 
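# Summarize the descriptive statistics.
# Compute the descriptive statistics for both number and object columns and share them.
 
df.describe(include = 'object') # descriptive statistics for the object-type columns; the default df.describe() covers numeric columns only
# The include and exclude options select which dtypes are summarized, so object-type statistics are available too.
# describe() works on both Series and DataFrames.

# df.describe(include = 'O') , df.describe(include = ['number', 'object'])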
-----------------------------------------------------------------------------
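
To make the difference between the include options concrete, here is a small sketch on a toy DataFrame (the values are made up, not taken from the KRX listing). The Market column is created with an explicit object dtype so the example behaves the same across pandas versions.

```python
import pandas as pd

toy = pd.DataFrame({
    "Market": pd.Series(["KOSPI", "KOSDAQ", "KOSPI"], dtype=object),
    "Volume": [100, 200, 300],
})

num_stats = toy.describe()                              # numeric columns only (default)
obj_stats = toy.describe(include="object")              # object columns only
all_stats = toy.describe(include=["number", "object"])  # both kinds together

print(num_stats)
print(obj_stats)
print(all_stats)
```

Numeric columns get count, mean, std, min, quartiles, and max, while object columns get count, unique, top, and freq. In the combined table, statistics that do not apply to a column are shown as NaN.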

3) Saving to a file and loading it back

# Preview the first rows with head
# (internally this is equivalent to self.iloc[:n])
df.head()

# import pandas as pd

1. Saving the data

df.to_csv('finance_data.csv', index = False )

 

 

2. Loading the data

pd.read_csv('finance_data.csv')
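
One caveat with the CSV round trip: `read_csv` infers dtypes from the text, so a code column such as Code that consists only of digits is read back as integers and loses its leading zeros. The sketch below shows this on a toy frame with made-up codes, using an in-memory buffer instead of a file, and keeps the codes intact by passing `dtype={"Code": str}`.

```python
import io
import pandas as pd

# Toy frame with zero-padded codes (hypothetical values).
toy = pd.DataFrame({"Code": ["000123", "004560"], "Name": ["AAA", "BBB"]})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
toy.to_csv(buffer, index=False)

buffer.seek(0)
default_read = pd.read_csv(buffer)                   # Code is inferred as integer
buffer.seek(0)
str_read = pd.read_csv(buffer, dtype={"Code": str})  # Code stays a string

print(default_read["Code"].tolist())
print(str_read["Code"].tolist())
```

With the real file, the equivalent call would be `pd.read_csv('finance_data.csv', dtype={'Code': str})`.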