https://github.com/stoqs/stoqs/blob/master/stoqs/contrib/analysis/__init__.py, _getMeasuredPPData
Extend the old function, which fetches measured data when two parameters are provided. Extending it to fetch either all parameters for a given platform or a given list of parameters would make more data and features available when exploring and modeling the output data, which can be important for improving the performance of machine learning algorithms.
The goal is to get this data into a pandas DataFrame, or something similar, so that there is an easier base to work from when implementing additional machine learning algorithms.
I, @MBARIMike, @bretstine, and @markmocek will look into this issue in more detail as part of Fall Capstone 2018.
This will be an important addition to the STOQS code base!
I think what we want to enhance is the createLabels() function in classify.py, which calls a slightly different method in __init__.py: _getPPData(). When this method is called, only the MeasuredParameter IDs are used from what it returns. If I interpret the requirements of this issue correctly, the following is the list of functional requirements.
If I remember correctly, _getPPData() is a more generalized improvement over _getMeasuredPPData(): it reuses a method already developed for the UI and, regardless of the number of MeasuredParameters, can be passed a pvDict dictionary of value constraints on the selection.
The code already developed for the UI builds raw SQL statements that perform self-joins to retrieve the multiple parameters plotted in the Parameter-Parameter section of the UI. This code would be difficult to extend. Perhaps we can take a new approach to getting the data into a form suitable for exploration and modeling with machine learning techniques.
Here is an example of a new approach: a Django query that fetches the first 20 data values from dorado.
(venv-stoqs) [vagrant@localhost stoqsgit]$ stoqs/manage.py shell_plus
...
In [1]: mps = MeasuredParameter.objects.using('stoqs_september2013_o').filter(
   ...:     measurement__instantpoint__activity__platform__name='dorado')
   ...:
In [2]: for i, mp in enumerate(mps[:20]):
   ...:     if i == 0:
   ...:         print("time, depth, latitude, longitude, parameter__name, measuredparameter__datavalue")
   ...:     print(f"{mp.measurement.instantpoint.timevalue}, {mp.measurement.depth:.2f},"
   ...:           f" {mp.measurement.geom.y:.6f}, {mp.measurement.geom.x:.6f}"
   ...:           f" {mp.parameter.name}, {mp.datavalue}")
   ...:
time, depth, latitude, longitude, parameter__name, measuredparameter__datavalue
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 sigmat, 25.1383576072121
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 spice, 0.830712889765499
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 altitude, 1395.68956636994
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 temperature, 13.9910522171992
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 salinity, 33.6403972259011
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 oxygen, 5.670288605996
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 nitrate, 0.21
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 bbp420, 0.00231458255927606
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 bbp700, 0.00228426640768986
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 fl700_uncorr, 0.000823624706576738
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 biolume, 194666664.695293
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 roll, -4.08951048388392
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 pitch, -0.105888989907026
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 yaw, 175.513420572358
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 sepCountList, None
2013-09-17 18:42:20, -0.03, 36.734970, -122.128144 mepCountList, None
2013-09-17 18:42:18, -0.04, 36.734989, -122.128162 sigmat, 25.1403727711047
2013-09-17 18:42:18, -0.04, 36.734989, -122.128162 spice, 0.829269194464183
2013-09-17 18:42:18, -0.04, 36.734989, -122.128162 altitude, 1395.49904668803
2013-09-17 18:42:18, -0.04, 36.734989, -122.128162 temperature, 13.9828055034561
Is there a way to pivot output like this to get the data into a form that can be analyzed in Pandas?
@MBARIMike That is definitely close to the direction we want to go. Perhaps "extending" the existing function was the wrong wording, given that we are starting fresh. Thanks for clarifying.
Also, Pandas has a DataFrame.from_records() method that loads Django data into a DataFrame. For example:
In [1]: import pandas as pd
In [2]: mps = MeasuredParameter.objects.using('stoqs_september2013_o').filter(
   ...:     measurement__instantpoint__activity__platform__name='dorado')
   ...:
In [3]: df = pd.DataFrame.from_records(mps.values(
   ...:     'measurement__instantpoint__timevalue', 'measurement__depth',
   ...:     'measurement__geom', 'parameter__name', 'datavalue', 'id'
   ...: ))
   ...:
In [4]: df.head(20)
Out[4]:
datavalue id measurement__depth measurement__geom measurement__instantpoint__timevalue parameter__name
0 2.476802e+01 5664562 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 sigmat
1 1.262683e+00 5673227 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 spice
2 2.546787e+01 5690556 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 altitude
3 1.582349e+01 5577911 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 temperature
4 3.367453e+01 5629901 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 salinity
5 6.593205e+00 5586576 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 oxygen
6 5.360300e+02 5595241 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 nitrate
7 9.528316e-03 5603906 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 bbp420
8 6.610731e-03 5612571 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 bbp700
9 4.761394e-04 5621236 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 fl700_uncorr
10 9.728126e+09 5638566 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 biolume
11 -1.292509e+01 5647231 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 roll
12 -6.497791e+00 5655896 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 pitch
13 5.802254e+01 5664561 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 yaw
14 NaN 5690705 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 sepCountList
15 NaN 5691417 -0.055507 [-121.934897431052, 36.90470983771924] 2013-09-16 20:55:49 mepCountList
16 2.476093e+01 5664563 -0.082238 [-121.93492018129153, 36.90469289678784] 2013-09-16 20:55:47 sigmat
17 1.270436e+00 5673228 -0.082238 [-121.93492018129153, 36.90469289678784] 2013-09-16 20:55:47 spice
18 2.544076e+01 5690555 -0.082238 [-121.93492018129153, 36.90469289678784] 2013-09-16 20:55:47 altitude
19 1.585611e+01 5577910 -0.082238 [-121.93492018129153, 36.90469289678784] 2013-09-16 20:55:47 temperature
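One possible answer to the pivot question is pandas' `DataFrame.pivot_table`, which can turn the long layout above (one row per measured value) into a wide one (one row per measurement, one column per parameter). The following is only a minimal sketch, not STOQS code: the helper name `pivot_measured_parameters` is hypothetical, the sample rows are a few values copied from the `df.head(20)` output above, and the `measurement__geom` column is left out on the assumption that it would first need to be split into plain longitude and latitude columns before it could be part of the index.

```python
import pandas as pd

TIME = "measurement__instantpoint__timevalue"
DEPTH = "measurement__depth"

# Sample rows shaped like the dicts yielded by QuerySet.values();
# the numbers are copied from the df.head(20) output in this thread.
records = [
    {TIME: "2013-09-16 20:55:49", DEPTH: -0.055507, "parameter__name": "sigmat", "datavalue": 24.76802},
    {TIME: "2013-09-16 20:55:49", DEPTH: -0.055507, "parameter__name": "spice", "datavalue": 1.262683},
    {TIME: "2013-09-16 20:55:49", DEPTH: -0.055507, "parameter__name": "temperature", "datavalue": 15.82349},
    {TIME: "2013-09-16 20:55:47", DEPTH: -0.082238, "parameter__name": "sigmat", "datavalue": 24.76093},
    {TIME: "2013-09-16 20:55:47", DEPTH: -0.082238, "parameter__name": "spice", "datavalue": 1.270436},
    {TIME: "2013-09-16 20:55:47", DEPTH: -0.082238, "parameter__name": "temperature", "datavalue": 15.85611},
]


def pivot_measured_parameters(long_df):
    """Hypothetical helper: one row per (time, depth), one column per parameter."""
    wide = long_df.pivot_table(
        index=[TIME, DEPTH],
        columns="parameter__name",
        values="datavalue",
        aggfunc="first",  # assumes at most one value per parameter per measurement
    )
    wide.columns.name = None  # drop the "parameter__name" label on the column axis
    return wide.reset_index()


long_df = pd.DataFrame.from_records(records)
long_df[TIME] = pd.to_datetime(long_df[TIME])
wide_df = pivot_measured_parameters(long_df)
print(wide_df)
```

In the wide layout each parameter becomes a feature column, which is the shape most machine learning libraries expect.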
So, instead of manipulating x and y the way loadLabeledData does, we could write a new function with this code and return a pandas DataFrame. Would you like us to do this work by adding to classify.py, or by creating a new file?
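As a rough illustration of what such a function might look like, here is a sketch under stated assumptions. The names `measured_parameters_to_dataframe`, `FIELDS`, and `FakeQuerySet` are hypothetical and do not exist in STOQS. The function relies only on the queryset having a `.values(*fields)` method, as Django QuerySets do, so the stand-in class lets the sketch run without a database. The two sample rows reuse values from the output shown earlier.

```python
import pandas as pd

# Field lookups taken from the from_records() example earlier in this thread
# (measurement__geom omitted to keep the sketch simple).
FIELDS = (
    "measurement__instantpoint__timevalue",
    "measurement__depth",
    "parameter__name",
    "datavalue",
    "id",
)


def measured_parameters_to_dataframe(queryset, fields=FIELDS):
    """Hypothetical helper: long-format DataFrame, one row per MeasuredParameter."""
    rows = list(queryset.values(*fields))  # evaluates the query once
    return pd.DataFrame.from_records(rows, columns=list(fields))


class FakeQuerySet:
    """Stand-in for a Django QuerySet (assumption: only .values() is needed)."""

    def __init__(self, rows):
        self._rows = rows

    def values(self, *fields):
        return [{f: row[f] for f in fields} for row in self._rows]


sample = FakeQuerySet([
    {"measurement__instantpoint__timevalue": "2013-09-16 20:55:49",
     "measurement__depth": -0.055507, "parameter__name": "sigmat",
     "datavalue": 24.76802, "id": 5664562},
    {"measurement__instantpoint__timevalue": "2013-09-16 20:55:49",
     "measurement__depth": -0.055507, "parameter__name": "spice",
     "datavalue": 1.262683, "id": 5673227},
])
df = measured_parameters_to_dataframe(sample)
print(df)
```

With a real database, the `mps` queryset from the examples above would presumably be passed in place of `sample`.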
For now, creating a new file is fine. Perhaps it will be a Jupyter Notebook that demonstrates the analysis.
So, looking at classify.py, should we set up a process_command_line() function for this new file?
We need to understand the functional requirements better. Perhaps new options (or implementing options that already exist in classify.py but are not yet implemented) would be the approach. I would like to see a Jupyter Notebook demo; it will help in deciding.