Scikit-learn: Make random_state descriptions more informative and refer to Glossary

에 λ§Œλ“  2018λ…„ 01μ›” 29일  Β·  60μ½”λ©˜νŠΈ  Β·  좜처: scikit-learn/scikit-learn

We have recently added a Glossary to our documentation, which describes common parameters among other things. We should now replace the descriptions of random_state parameters to make them more concise and informative (see #10415). For example, instead of

    random_state : int, RandomState instance or None, optional, default: None
        If int, random_state is the seed used by the random number generator;
        If RandomState instance, random_state is the random number generator;
        If None, the random number generator is the RandomState instance used
        by `np.random`.

in both KMeans and MiniBatchKMeans, we might have:

KMeans:
    random_state : int, RandomState instance, default=None
        Determines random number generation for centroid initialization.
        Pass an int for reproducible results across multiple function calls.
        See :term:`Glossary <random_state>`.


MiniBatchKMeans:
    random_state : int, RandomState instance, default=None
        Determines random number generation for centroid initialization and
        random reassignment.
        Pass an int for reproducible results across multiple function calls.
        See :term:`Glossary <random_state>`.

The description should therefore focus on the impact random_state has on the algorithm.
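
As a rough illustration of the wording proposed above, the sketch below fits KMeans twice with the same int seed and checks that the centroids agree. The toy data, the helper name `fit_centers`, and the choice of `n_init=1` are assumptions made for this example and are not part of the proposal.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data: three loose blobs (illustrative only, not taken from the issue).
data_rng = np.random.RandomState(42)
X = np.vstack(
    [data_rng.normal(loc=center, scale=0.5, size=(50, 2)) for center in (0.0, 3.0, 6.0)]
)


def fit_centers(seed):
    """Fit KMeans with a single centroid initialization driven by `seed`."""
    model = KMeans(n_clusters=3, n_init=1, random_state=seed)
    return model.fit(X).cluster_centers_


# Same int seed on both calls: the centroid initialization is repeated,
# so the fitted centroids should agree.
centers_a = fit_centers(0)
centers_b = fit_centers(0)
print(np.allclose(centers_a, centers_b))
```

With random_state left at None, repeated fits may start from different centroids, which is the behaviour the docstring is meant to describe.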

이 변경에 κΈ°μ—¬ν•˜λŠ” 데 관심이 μžˆλŠ” κΈ°μ—¬μžλŠ” μ²˜μŒμ— ν•œ λ²ˆμ— ν•˜λ‚˜μ˜ λͺ¨λ“ˆμ„ μˆ˜ν–‰ν•΄μ•Ό ν•©λ‹ˆλ‹€.

μˆ˜μ •ν•  estimator λͺ©λ‘μ€ λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€.

kwinata 슀크립트λ₯Ό μ‚¬μš©ν•˜μ—¬ μˆ˜μ •ν•  파일 λͺ©λ‘

  • [x] [sklearn/dummy.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/dummy.py) - 59
  • [x] [sklearn/multioutput.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/multioutput.py) - 578 , 738
  • [ ] [sklearn/kernel_approximation.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/kernel_approximation.py) - 41 , 143 , 470
  • [ ] [sklearn/multiclass.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/multiclass.py) - 687
  • [x] [sklearn/random_projection.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/random_projection.py) - 178 , 245 , 464 , 586
  • [x] [sklearn/feature_extraction/image.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/image.py) - 368 , 502
  • [x] [sklearn/utils/random.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/random.py) - 39 곡개 홍보
  • [x] [sklearn/utils/extmath.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/extmath.py) - 185 , 297
  • [X]λŠ” sklearn / 앙상블 / _hist_gradient_boosting / gradient_boosting.py (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py) - (736) , (918)
  • [x] [sklearn/ensemble/_hist_gradient_boosting/binning.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/_hist_gradient_boosting/binning.py) - 37 , 112

  • [x] [sklearn/ensemble/_bagging.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/_bagging.py) - 503 , 902

  • [x] [sklearn/ensemble/_gb.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/_gb.py) - 887 , 1360
  • [x] [sklearn/ensemble/_forest.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/_forest.py) - 965 , 1282 , 1559 , 1868 , 2103
  • [x] [sklearn/ensemble/_iforest.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/_iforest.py) - 109
  • [ ] [sklearn/ensemble/_base.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/_base.py) - 52
  • [x] [sklearn/ensemble/_weight_boosting.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/_weight_boosting.py) - 188 , 324 , 479 , 900 , 1022
  • [x] [sklearn/decomposition/_truncated_svd.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/_truncated_svd.py) - 59 곡개 홍보
  • [x] [sklearn/decomposition/_kernel_pca.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/_kernel_pca.py) - 79 곡개 홍보
  • [x] [sklearn/decomposition/_dict_learning.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/_dict_learning.py) - 364 , 485 , 692 , 1135 , 1325 곡개 홍보
  • [x] [sklearn/decomposition/_fastica.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/_fastica.py) - 205 , 344 곡개 PR
  • [x] [sklearn/decomposition/_nmf.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/_nmf.py) - 290 , 475 , 966 , 1159 μ—΄κΈ° 홍보
  • [x] [sklearn/decomposition/_pca.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/_pca.py) - 192 곡개 홍보
  • [x] [sklearn/decomposition/_sparse_pca.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/_sparse_pca.py) - 82 , 285 곡개 홍보
  • [x] [sklearn/decomposition/_lda.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/_lda.py) - 60 , 79 , 225 곡개 홍보
  • [x] [sklearn/decomposition/_factor_analysis.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/_factor_analysis.py) - 92 곡개 PR
  • [x] [sklearn/cluster/_kmeans.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cluster/_kmeans.py) - 56 , 241 , 380 , 583 , 700 , 1150 , 1370
  • [x] [sklearn/cluster/_spectral.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cluster/_spectral.py) - 41 , 197 , 313
  • [x] [sklearn/cluster/_bicluster.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cluster/_bicluster.py) - 236 , 383
  • [x] [sklearn/cluster/_mean_shift.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cluster/_mean_shift.py) - 48
  • [x] [sklearn/preprocessing/_data.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/preprocessing/_data.py) - 2178 , 2607
  • [x] [sklearn/impute/_iterative.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/impute/_iterative.py) - 125
  • [x] [sklearn/linear_model/_ransac.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_ransac.py) - 152 곡개 홍보
  • [x] [sklearn/linear_model/_coordinate_descent.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_coordinate_descent.py) - 580 , 860 , 1313 , 1487 , 1665 , 1851 , 2016 , 2192 곡개 홍보
  • [x] [sklearn/linear_model/_sag.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_sag.py) - 154 곡개 홍보
  • [x] [sklearn/linear_model/_perceptron.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_perceptron.py) - 55 곡개 홍보
  • [x] [sklearn/linear_model/_passive_aggressive.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_passive_aggressive.py) - 76 , 322 곡개 홍보
  • [x] [sklearn/linear_model/_logistic.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_logistic.py) - 587 , 924 , 1100 , 1658 μ—΄κΈ° 홍보
  • [x] [sklearn/linear_model/_base.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_base.py) - 65
  • [x] [sklearn/linear_model/_stochastic_gradient.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_stochastic_gradient.py) - 369 , 811 , 1419 곡개 PR
  • [x] [sklearn/linear_model/_theil_sen.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_theil_sen.py) - 243 곡개 홍보
  • [x] [sklearn/linear_model/_ridge.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/_ridge.py) - 325 , 693 , 853 곡개 홍보
  • [x] [sklearn/tree/_classes.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_classes.py) - 653 , 1033 , 1322 , 1552
  • [x] [sklearn/feature_selection/_mutual_info.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_selection/_mutual_info.py) - 226 , 335 , 414
  • [x] [sklearn/metrics/cluster/_unsupervised.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/cluster/_unsupervised.py) - 80
  • [x] [sklearn/svm/_classes.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/svm/_classes.py) - 90 , 312 , 546 , 752 μ—΄κΈ° 홍보
  • [x] [sklearn/svm/_base.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/svm/_base.py) - 853 곡개 PR
  • [x] [sklearn/inspection/_permutation_importance.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/inspection/_permutation_importance.py) - 81
  • [x] [sklearn/gaussian_process/_gpr.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/gaussian_process/_gpr.py) - 109 , 382
  • [x] [sklearn/gaussian_process/_gpc.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/gaussian_process/_gpc.py) - 110 , 537
  • [x] [sklearn/manifold/_spectral_embedding.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/manifold/_spectral_embedding.py) - 171 , 387
  • [x] [sklearn/manifold/_locally_linear.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/manifold/_locally_linear.py) - 146 , 252 , 584
  • [x] [sklearn/manifold/_t_sne.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/manifold/_t_sne.py) - 558
  • [x] [sklearn/manifold/_mds.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/manifold/_mds.py) - 51 , 198 , 314
  • [x] [sklearn/utils/_testing.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/_testing.py) - 521
  • [x] [sklearn/utils/__init__.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/__init__.py) - 478 , 623
  • [x] [sklearn/datasets/_kddcup99.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/_kddcup99.py) - 79
  • [x] [sklearn/datasets/_covtype.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/_covtype.py) - 69
  • [x] [sklearn/datasets/_rcv1.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/_rcv1.py) - 114
  • [x] [sklearn/datasets/_samples_generator.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/_samples_generator.py) - 127 , 323 , 440 , 531 , 618 , 688 , 767 , 904 , 965 , 1,030 , 1,106 , 1,159 , 1,218 , 1,258 , 1,307 , 1,368 , 1,420 , 1,483 , 1,571 , 1,662
  • [x] [sklearn/datasets/_olivetti_faces.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/_olivetti_faces.py) - 64
  • [x] [sklearn/datasets/_base.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/_base.py) - 157
  • [x] [sklearn/datasets/_twenty_newsgroups.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/_twenty_newsgroups.py) - 187
  • [x] [sklearn/mixture/_bayesian_mixture.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/mixture/_bayesian_mixture.py) - 166
  • [x] [sklearn/mixture/_base.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/mixture/_base.py) - 139
  • [x] [sklearn/mixture/_gaussian_mixture.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/mixture/_gaussian_mixture.py) - 504
  • [x] [sklearn/model_selection/_validation.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/model_selection/_validation.py) - 1006 , 1176
  • [x] [sklearn/model_selection/_split.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/model_selection/_split.py) - 382 , 588 , 1091 , 1196 , 1250 , 1390 , 1492 , 1605 , 2049 곡개 홍보
  • [x] [sklearn/model_selection/_search.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/model_selection/_search.py) - 207 , 1299
  • [x] [sklearn/neural_network/_multilayer_perceptron.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/neural_network/_multilayer_perceptron.py) - 782 , 1174
  • [x] [sklearn/neural_network/_rbm.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/neural_network/_rbm.py) - 59
  • [x] [sklearn/neighbors/_kde.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/neighbors/_kde.py) - 233
  • [x] [sklearn/neighbors/_nca.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/neighbors/_nca.py) - 112
  • [x] [sklearn/covariance/_robust_covariance.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/covariance/_robust_covariance.py) - 63 , 233 , 328 , 545
  • [x] [sklearn/covariance/_elliptic_envelope.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/covariance/_elliptic_envelope.py) - 40
Documentation Moderate Sprint good first issue

All 60 comments

Hi @jnothman, can I take up this issue? Thanks

Claim a module/subpackage and go ahead...

On 30 January 2018 at 00:24, Somya Anand [email protected] wrote:

Hi @jnothman https://github.com/jnothman, can I take up this issue? Thanks

@jnothman Sorry for being naive, but could you elaborate on module/submodule? For example, do you mean subpackages like KMeans?

What @jnothman means is, for example, start with one file such as sklearn/cluster/k_means_.py, update the random_state docstrings as in the top post, and open a PR.

A subpackage is something like sklearn.cluster.

Thanks. I'll do that and open a PR.

Hi @jnothman!

Should I also replace the following description, found in grid_search.py? It has an extra line compared to the one you shared.

    random_state : int, RandomState instance or None, optional (default=None)
        Pseudo random number generator state used for random uniform sampling
        from lists of possible values instead of scipy.stats distributions.
        If int, random_state is the seed used by the random number generator;
        If RandomState instance, random_state is the random number generator;
        If None, the random number generator is the RandomState instance used
        by `np.random`.

I can take grid_search.py and k_means.py (KMeans).

Leave grid_search.py alone; it is deprecated. The point is to minimise content that is repeated and already available in the glossary, while giving users the most informative description of the role of random_state in the particular estimator.

@jnothman Thanks. Do I need to understand these algorithms before replacing the random_state descriptions?

You need to understand the algorithms broadly, but not every detail of their implementation. If the randomisation in an algorithm is not entirely clear, you should be able to find where random_state is used. In some cases it may be appropriate to give little more detail than a link to the glossary. We'll have to see how it goes.

Okay, thanks. I'll go through the algorithms slowly.

Regards,
Shimil Rastogi

I've opened pull request #10614.

@aby0, are you still claiming the sklearn.cluster module?

Any update? We have a long holiday coming up, so let me know if I can pick this up.

I'm already looking at the docstrings for #10731, so I'll take the datasets module.

I'm claiming the linear_model module. I'll raise a PR soon. Raised #11900.

Claiming the decomposition module next.

A checklist of the modules that need this work:

  • [ ] 개발자
  • [ ] 곡뢄산
  • [x] λΆ„ν•΄
  • [ ] 더미.py
  • [ ] 앙상블
  • [ ] νŠΉμ§• μΆ”μΆœ
  • [ ] κΈ°λŠ₯ 선택
  • [ ] κ°€μš°μŠ€ ν”„λ‘œμ„ΈμŠ€
  • [ ] kernel_approximation.py
  • [x] #11900을 ν†΅ν•œ linear_model
  • [ ] λ§€λ‹ˆν΄λ“œ
  • [ ] μΈ‘μ •ν•­λͺ©
  • [ ] ν˜Όν•©λ¬Ό
  • [ ] λͺ¨λΈ 선택
  • [ ] λ©€ν‹°ν΄λž˜μŠ€.py
  • [ ] λ‹€μ€‘μΆœλ ₯.py
  • [ ] 이웃
  • [ ] 신경망
  • [ ] μ „μ²˜λ¦¬
  • [ ] random_projection.py
  • [ ] svm
  • [ ] λ‚˜λ¬΄
  • [ ] μœ ν‹Έλ¦¬ν‹°

We had trouble reaching consensus on how to strike the right balance here, iirc.

So please pay attention to the earlier PRs merged above.

@jnothman Thanks! I'll update the PR to mention reproducibility when passing an int.

Once the review is done, I'm happy to take all the other modules in another PR...

I'm claiming covariance.

@BlackTeaAndCoffee note that the docstring format has not been finalised yet, and discussion is ongoing in the other PRs listed here.

I'm claiming feature_extraction.

@jnothman, @NicolasHug, I just found #15222 and the many PRs related to it.
I wonder whether one of the two issues could be closed, to make things clear for the sprint. If so, which one? That would avoid duplicated information. Thanks for your collaboration.

I wasn't aware of this issue (I should have checked better). I'm happy to close https://github.com/scikit-learn/scikit-learn/issues/15222 in favour of this one.

@jnothman in your opinion, does this issue deserve the 'Moderate' label?

@mojc and I would like to work on ensemble/_hist_gradient_boosting/binning.

#wimlds

@anaisabeldhero and I would like to work on manifold/*
#wimlds #SciKitLearnSprint

@daphn3k and I will work on sklearn/gaussian_process/

#wimlds #SciKitLearnSprint

We would like to work on sklearn/preprocessing/_data.py - 2178, 2607
@rachelcjordan and @fabi-cast

#wimlds #SciKitLearnSprint

@Malesche and I would like to take sklearn/inspection/_permutation_importance.py

#WiMLDS

Claiming the file sklearn/metrics/cluster/_unsupervised.py! #wimlds

@daphn3k and I will also take covariance/* and neighbors/* #wimlds

Claiming:
sklearn/dummy.py - 59
sklearn/multioutput.py - 578, 738
sklearn/kernel_approximation.py - 41, 143, 470
sklearn/multiclass.py - 687
sklearn/random_projection.py - 178, 245, 464, 586

PSA: please use the original sentence:

Pass an int for reproducible results across multiple function calls.

instead of what I'm currently seeing in PRs:

Use an int to make the randomness deterministic.

which is not accurate, since the RNG is always deterministic regardless of what is passed.

CC @adrinjalali since I think you're at the sprint
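
To make the distinction in this comment concrete, here is a minimal sketch using NumPy only. The helper `draw` is hypothetical and merely stands in for an estimator that consumes random_state; it is not scikit-learn code, and it leaves out the None case.

```python
import numpy as np


def draw(random_state, n=3):
    """Hypothetical stand-in for one call that consumes random_state."""
    if isinstance(random_state, np.random.RandomState):
        rng = random_state  # reuse the caller's generator; its state advances
    else:
        rng = np.random.RandomState(random_state)  # fresh generator seeded with the int
    return rng.randint(0, 1000, size=n)


# Passing an int: every call re-seeds, so every call returns the same numbers.
int_first, int_second = draw(0), draw(0)

# Passing an instance: still fully deterministic, but the stream continues
# from one call to the next, so the second call returns different numbers.
shared = np.random.RandomState(0)
inst_first, inst_second = draw(shared), draw(shared)

# Replaying from an identically seeded instance reproduces the same stream.
replay = np.random.RandomState(0)
replay_first, replay_second = draw(replay), draw(replay)

print(np.array_equal(int_first, int_second))
print(np.array_equal(inst_first, inst_second))
print(np.array_equal(inst_second, replay_second))
```

Both ways of seeding are deterministic; only the int gives the same result on each separate call, which is what "reproducible results across multiple function calls" refers to.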

신경망 및 ν˜Όν•© μž‘μ—…

Hi @NicolasHug, was this meant as a comment on a PR? Which one? :)

I'm going to work on scikit-learn/sklearn/model_selection/_validation.py.

@cmarmo It was a general comment for all PRs. I saw one PR and commented there, then saw a second one and realised it was a pattern better addressed at the source.

Sorry @NicolasHug, my bad. I couldn't easily track the comments.

@NicolasHug the sentence has been changed back to the original wording in the commits from @anaisabeldhero and me.

@Olks and I are claiming sklearn/utils/extmath.py - 185, 297

Claiming sklearn/ensemble/_iforest.py - 109

Claiming sklearn/neural_network/_multilayer_perceptron.py - 782, 1174

Claiming sklearn/ensemble/_weight_boosting.py - 188, 324, 479, 900, 1022

Claiming sklearn/multioutput.py - 578, 738

Claiming:
sklearn/mixture/_bayesian_mixture.py - 166
sklearn/mixture/_base.py - 139
sklearn/mixture/_gaussian_mixture.py - 504

Claiming sklearn/ensemble/_gb.py - 887, 1360

Claiming sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py - 736, 918

Claiming sklearn/neural_network/_rbm.py - 59

Claiming:

sklearn/svm/_classes.py - 90, 312, 546, 752
sklearn/svm/_base.py - 853

Claiming:

sklearn/feature_selection/_mutual_info.py - 226, 335, 414
sklearn/metrics/cluster/_unsupervised.py - 80
sklearn/utils/_testing.py - 521
sklearn/utils/__init__.py - 478, 623

Claiming:

sklearn/dummy.py - 59
sklearn/random_projection.py - 178, 245, 464, 586

@DatenBiene @GregoireMialon thanks for all your contributions during the last sprint. Only 3 modules are left unchecked!

Would you be interested / have time / be motivated to tackle them (no pressure!)?

Hi Jérémie! I'll try to have a look soon.

On Wed, 12 Feb 2020 at 15:53, Jérémie du Boisberranger <[email protected]> wrote:

@DatenBiene https://github.com/DatenBiene @GregoireMialon https://github.com/GregoireMialon thanks for all your contributions during the last sprint. Only 3 modules are left unchecked!

Would you be interested / have time / be motivated to tackle them (no pressure!)?

Hi @jeremiedbb! I'll try to finish the 3 remaining modules today 😃

Claiming:

sklearn/kernel_approximation.py - 41, 143, 470
sklearn/multiclass.py - 687
sklearn/ensemble/_base.py - 52

Hi @jnothman and @jeremiedbb, it looks like all the files have now been modified. I'm happy to help if you find any remaining issues.

Thanks @DatenBiene and all the contributors who worked to wrap up this issue!
I think we can close this huge one!
Feel free to open new, specific issues if anything is still missing from the random_state descriptions.

이 νŽ˜μ΄μ§€κ°€ 도움이 λ˜μ—ˆλ‚˜μš”?
0 / 5 - 0 λ“±κΈ‰