Pandas: [Good first issue] TST: Disallow bare pytest.raises

Created on 14 Jan 2020  ·  51 Comments  ·  Source: pandas-dev/pandas

End users rely on error messages when debugging. Thus, it is important that we make sure the correct error message is surfaced for the error triggered.

The core idea is to convert this:

with pytest.raises(klass):
    # Some code that raises an error

To this:

with pytest.raises(klass, match=msg):
    # Some code that raises an error
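
As a minimal illustration of the difference (the halve function and its message are invented for this example; they are not from pandas):

import pytest

def halve(x):
    if x % 2:
        raise ValueError(f"{x} is not divisible by 2")
    return x // 2

def test_halve_error_bare():
    # Bare form: passes for *any* ValueError, even one raised for the
    # wrong reason.
    with pytest.raises(ValueError):
        halve(3)

def test_halve_error_message():
    # Preferred form: also asserts the message the end user will see.
    with pytest.raises(ValueError, match="3 is not divisible by 2"):
        halve(3)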

You can read more about pytest.raises in the pytest documentation.


Side note:

If the raised error is an external error (meaning it is not pandas-specific), you should use external_error_raised instead of pytest.raises.

external_error_raised is used exactly like pytest.raises; the only difference is that you don't pass in the match argument.

For example:

import pandas._testing as tm

def test_foo():
    with tm.external_error_raised(ValueError):
        raise ValueError("foo")

Key notes:

  • Don't forget to link this issue in your PR by pasting https://github.com/pandas-dev/pandas/issues/30999 in the PR description.

  • Please comment with what you are planning to work on, so we won't do double work (no need to mention me; just state which files you plan to work on, and remember to check whether they are already taken).

  • If a file that should be marked as "done" (i.e., there is no more work to do in it) isn't marked as "done", please comment to let me know, mentioning me with @MomIsBestFriend in the comment body so I'll know where to look.


To generate the full list yourself, you can run:

python scripts/validate_unwanted_patterns.py -vt="bare_pytest_raises" pandas/tests/

You can also run it against a single file, like so:

python scripts/validate_unwanted_patterns.py -vt="bare_pytest_raises" pandas/tests/PATH/TO/SPECIFIC/FILE.py

If a file contains a bare pytest.raises, the script will output the following:

pandas/tests/arithmetic/test_numeric.py:553:Bare pytests raise have been found. Please pass in the argument 'match' as well the exception

This means that there is a bare pytest.raises at line 553 of pandas/tests/arithmetic/test_numeric.py.
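
One detail worth knowing when adding match (this is general pytest behavior; the message below is invented): the pattern is applied as a regular expression via re.search, so messages containing regex metacharacters such as parentheses need escaping:

import re

import pytest

def test_match_with_metacharacters():
    # "(" and ")" are regex metacharacters, so escape the literal
    # message before passing it to match.
    msg = re.escape("Length mismatch (expected 4, got 3)")
    with pytest.raises(ValueError, match=msg):
        raise ValueError("Length mismatch (expected 4, got 3)")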


The current list is:

  • [x] pandas/tests/io/pytables/test_timezones.py
  • [ ] pandas/tests/generic/methods/test_pipe.py
  • [ ] pandas/tests/reshape/merge/test_merge_asof.py
  • [ ] pandas/tests/extension/base/reduce.py
  • [x] pandas/tests/arrays/test_datetimelike.py
  • [ ] pandas/tests/extension/test_boolean.py
  • [ ] pandas/tests/extension/base/getitem.py
  • [ ] pandas/tests/arrays/boolean/test_arithmetic.py
  • [ ] pandas/tests/extension/base/setitem.py
  • [ ] pandas/tests/indexes/interval/test_astype.py
  • [ ] pandas/tests/io/parser/test_network.py
  • [ ] pandas/tests/extension/test_integer.py
  • [ ] pandas/tests/indexing/multiindex/test_partial.py
  • [ ] pandas/tests/io/parser/test_python_parser_only.py
  • [ ] pandas/tests/io/test_html.py
  • [ ] pandas/tests/reductions/test_stat_reductions.py
  • [ ] pandas/tests/dtypes/test_inference.py
  • [ ] pandas/tests/plotting/test_hist_method.py
  • [ ] pandas/tests/series/apply/test_series_apply.py
  • [ ] pandas/tests/io/excel/test_xlrd.py
  • [ ] pandas/tests/indexes/test_common.py
  • [ ] pandas/tests/util/test_assert_series_equal.py
  • [ ] pandas/tests/extension/base/ops.py
  • [ ] pandas/tests/io/test_clipboard.py
  • [ ] pandas/tests/plotting/frame/test_frame_color.py
  • [ ] pandas/tests/window/moments/test_moments_ewm.py
  • [ ] pandas/tests/io/test_gbq.py
  • [ ] pandas/tests/reductions/test_reductions.py
  • [ ] pandas/tests/io/test_feather.py
  • [ ] pandas/tests/resample/test_resampler_grouper.py
  • [ ] pandas/tests/indexes/multi/test_indexing.py
  • [ ] pandas/tests/io/test_common.py
  • [ ] pandas/tests/io/test_sql.py
  • [ ] pandas/tests/plotting/test_series.py
  • [ ] pandas/tests/io/test_fsspec.py
  • [ ] pandas/tests/extension/test_floating.py
  • [ ] pandas/tests/indexes/multi/test_setops.py
  • [ ] pandas/tests/reshape/test_get_dummies.py
  • [ ] pandas/tests/plotting/frame/test_frame_subplots.py
  • [ ] pandas/tests/plotting/test_backend.py
  • [ ] pandas/tests/generic/methods/test_sample.py
  • [ ] pandas/tests/plotting/test_boxplot_method.py
  • [ ] pandas/tests/io/test_parquet.py
  • [ ] pandas/tests/extension/test_string.py
  • [ ] pandas/tests/io/pytables/test_complex.py
  • [ ] pandas/tests/indexes/test_numpy_compat.py
  • [ ] pandas/tests/io/test_gcs.py
  • [ ] pandas/tests/io/sas/test_sas7bdat.py
  • [ ] pandas/tests/window/test_apply.py
  • [ ] pandas/tests/series/test_ufunc.py
  • [ ] pandas/tests/plotting/frame/test_frame.py
  • [ ] pandas/tests/reshape/test_union_categoricals.py
  • [ ] pandas/tests/io/json/test_ujson.py
  • [ ] pandas/tests/indexing/test_coercion.py
  • [ ] pandas/tests/io/pytables/test_store.py
  • [ ] pandas/tests/computation/test_compat.py
  • [ ] pandas/tests/io/json/test_pandas.py
  • [ ] pandas/tests/io/json/test_json_table_schema.py
  • [ ] pandas/tests/scalar/test_nat.py

NOTE:

The list may change, as files are constantly being moved/renamed.


Took pretty much everything from #23922, which was originally opened by @gfyoung.

Labels: Style, Testing, good first issue

All 51 comments

I'll take:

  • [x] pandas/tests/test_common.py
  • [x] pandas/tests/test_downstream.py
  • [x] pandas/tests/test_errors.py
  • [x] pandas/tests/test_lib.py
  • [x] pandas/tests/test_take.py
  • [x] pandas/tests/internals/test_internals.py
  • [x] pandas/tests/window/test_rolling.py

I will begin working on:

pandas/tests/arithmetic/test_numeric.py
pandas/tests/arithmetic/test_object.py
pandas/tests/arithmetic/test_period.py
pandas/tests/arithmetic/test_timedelta64.py
pandas/tests/arrays/interval/test_interval.py

@gdex1 I hope this will help you :)

(The numbers represent line numbers)

pandas/tests/arithmetic/test_numeric.py:138
pandas/tests/arithmetic/test_numeric.py:141
pandas/tests/arithmetic/test_numeric.py:190
pandas/tests/arithmetic/test_numeric.py:208
pandas/tests/arithmetic/test_numeric.py:210
pandas/tests/arithmetic/test_numeric.py:212
pandas/tests/arithmetic/test_numeric.py:214
pandas/tests/arithmetic/test_numeric.py:232
pandas/tests/arithmetic/test_numeric.py:234
pandas/tests/arithmetic/test_numeric.py:236
pandas/tests/arithmetic/test_numeric.py:238
pandas/tests/arithmetic/test_numeric.py:519
pandas/tests/arithmetic/test_numeric.py:610
pandas/tests/arithmetic/test_numeric.py:615
pandas/tests/arithmetic/test_numeric.py:617
pandas/tests/arithmetic/test_numeric.py:795
pandas/tests/arithmetic/test_numeric.py:798
pandas/tests/arithmetic/test_numeric.py:819
pandas/tests/arithmetic/test_object.py:140
pandas/tests/arithmetic/test_object.py:152
pandas/tests/arithmetic/test_object.py:154
pandas/tests/arithmetic/test_object.py:278
pandas/tests/arithmetic/test_object.py:280
pandas/tests/arithmetic/test_object.py:282
pandas/tests/arithmetic/test_object.py:284
pandas/tests/arithmetic/test_object.py:298
pandas/tests/arithmetic/test_object.py:301
pandas/tests/arithmetic/test_object.py:315
pandas/tests/arithmetic/test_object.py:318






pandas/tests/arithmetic/test_timedelta64.py:51
pandas/tests/arithmetic/test_timedelta64.py:445
pandas/tests/arithmetic/test_timedelta64.py:607
pandas/tests/arithmetic/test_timedelta64.py:609
pandas/tests/arithmetic/test_timedelta64.py:703
pandas/tests/arithmetic/test_timedelta64.py:705
pandas/tests/arithmetic/test_timedelta64.py:707
pandas/tests/arithmetic/test_timedelta64.py:709
pandas/tests/arithmetic/test_timedelta64.py:741
pandas/tests/arithmetic/test_timedelta64.py:743
pandas/tests/arithmetic/test_timedelta64.py:960
pandas/tests/arithmetic/test_timedelta64.py:972
pandas/tests/arithmetic/test_timedelta64.py:1028
pandas/tests/arithmetic/test_timedelta64.py:1037
pandas/tests/arithmetic/test_timedelta64.py:1039
pandas/tests/arithmetic/test_timedelta64.py:1502
pandas/tests/arithmetic/test_timedelta64.py:1505
pandas/tests/arithmetic/test_timedelta64.py:1508
pandas/tests/arithmetic/test_timedelta64.py:1511
pandas/tests/arithmetic/test_timedelta64.py:1536
pandas/tests/arithmetic/test_timedelta64.py:1591
pandas/tests/arithmetic/test_timedelta64.py:1783
pandas/tests/arithmetic/test_timedelta64.py:1785
pandas/tests/arithmetic/test_timedelta64.py:1911
pandas/tests/arithmetic/test_timedelta64.py:1960
pandas/tests/arithmetic/test_timedelta64.py:1962
pandas/tests/arithmetic/test_timedelta64.py:1968






pandas/tests/arrays/interval/test_interval.py:155

@gfyoung the list wasn't generated by grep -r -e "pytest.raises([a-zA-Z]*)" pandas/tests -l; it was generated by the script in #30755 (a validation type called bare_pytest_raises). I will put instructions in the issue body once it gets merged :smile:

@MomIsBestFriend I will help with :
pandas/tests/base/test_constructors.py
pandas/tests/base/test_ops.py

I can take care of these:
@MomIsBestFriend

pandas/tests/io/test_html.py
pandas/tests/io/test_parquet.py
pandas/tests/io/test_sql.py
pandas/tests/io/test_stata.py
pandas/tests/plotting/test_backend.py
pandas/tests/plotting/test_boxplot_method.py
pandas/tests/plotting/test_frame.py
pandas/tests/plotting/test_hist_method.py
pandas/tests/plotting/test_series.py
pandas/tests/reductions/test_reductions.py

@MomIsBestFriend there was quite some discussion in https://github.com/pandas-dev/pandas/issues/23922 about how to go about this. To repeat what I said there: I don't think we should "blindly" assert all error messages.

Some things that were said in that thread: limit it to internal error messages, limit the match to a few key words of the message, avoid complicated patterns.

Also, I think asserting error messages should go hand in hand with actually checking if it is a good, clear error message, and potentially improving this.

It might be good to distill a list of attention points from the discussion in the other issue to put here.

@jorisvandenbossche

@MomIsBestFriend there was quite some discussion in #23922 about how to go about this. To repeat what I said there: I don't think we should "blindly" assert all error messages.

I completely agree, but the problem is that newcomers don't know which error messages to assert and which not to. If we could somehow define rules on which error messages to assert, while at the same time keeping this issue "beginner friendly", that would be great (IMO).

Also, if we plan to enforce this in the CI, we need some way to mark which bare pytest raises are "bare" on purpose (IMO a comment in the style of isort: skip is enough), so that other people will know that a particular bare pytest.raises is intentional.
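
A purely hypothetical sketch of what such an opt-out marker could look like (no such marker existed at the time; the comment text is made up for illustration):

import pytest

def test_external_failure():
    # Hypothetical opt-out comment, analogous to "isort: skip", that a
    # linter could recognize before flagging the bare pytest.raises.
    with pytest.raises(ValueError):  # bare-pytest-raises: skip
        raise ValueError("message owned by an external library")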

Some things that were said in that thread: limit it to internal error messages, limit the match to a few key words of the message, avoid complicated patterns.

I don't see why we wouldn't want to test internal error messages; can you please elaborate further?

I see the point you made in https://github.com/pandas-dev/pandas/issues/23922#issuecomment-458551763, and I'm +1 on that, but I'm +2 (if that makes any sense) on https://github.com/pandas-dev/pandas/issues/23922#issuecomment-458733117 and https://github.com/pandas-dev/pandas/issues/23922#issuecomment-458735169 because IMO the benefit outweighs the cost.

Also, I think asserting error messages should go hand in hand with actually checking if it is a good, clear error message, and potentially improving this.

Absolutely agree.

It might be good to distill a list of attention points from the discussion in the other issue to put here.

I have read the conversation in #23922, but I didn't see anything that IMO is worth putting as a "note" in the issue's body; can you please point out what I missed?

I have read the conversation in #23922, but I didn't see anything that IMO is worth putting as a "note" in the issue's body; can you please point out what I missed?

I don't see much else to add from that issue either.

I completely agree, but the problem is that newcomers don't know which error messages to assert and which not to. If we could somehow define rules on which error messages to assert, while at the same time keeping this issue "beginner friendly", that would be great (IMO).

Also, if we plan to enforce this in the CI, we need some way to mark which bare pytest raises are "bare" on purpose (IMO a comment in the style of isort: skip is enough), so that other people will know that a particular bare pytest.raises is intentional.

These are in part the reasons why picking and choosing which messages to test and which not to test is not the direction I would prefer. I would also add that we sometimes check error message strings in except blocks, so good error messages also benefit us during development.

Also, if these "internal" messages aren't that important, why do we have an error message in the first place? I would then just create a helper that asserts the message is empty.

I don't see why we wouldn't want to test internal error messages, can you please elaborate even more?

So I said "limit it to internal error messages", but "internal" can be a bit ambiguous. I meant error messages that originate from pandas itself, and of course we want to test those. What I meant is that we (IMO) shouldn't test external error messages too much, meaning messages coming from e.g. numpy or other libraries. Numpy can change those, and then our tests start failing due to a cosmetic change in numpy (and this is not hypothetical; it happened just last week, I think).

Now, I used "internal" in a different context in https://github.com/pandas-dev/pandas/pull/30998#discussion_r366726966. There, I meant an internal, developer-oriented error message that should never be surfaced to the user. IMO, it is not important to test those against the exact error message.

I see the point you made in #23922 (comment), and I'm +1 on that, but I'm +2 (if that makes any sense) on #23922 (comment) and #23922 (comment) because IMO the benefit outweighs the cost.

Let's put @simonjayhawkins's comment that you link to here:

I am working on the assumption, maybe incorrectly, that it will be beneficial to

  1. identify tests that can be parametrized
  2. identify tests that should be split
  3. better understanding of the failure mode tested
  4. indirectly add more documentation to the tests
  5. identify where error messages could be more consistent
  6. identify tests that are redundant
  7. help improve error messages
  8. identify tests that are currently passing for the wrong reason.

Those are all useful things, I fully agree. But that is not simple, and if we want to get those things out of this issue, then this issue is not for beginners. Of course, beginners don't need to do all of those things at once, but I still have the feeling that the PRs adding asserts often come rather close to "blindly adding the current error message to the pytest.raises call" without going any further (the above points).

Also, if the above points are what make this exercise useful, then it is more concrete instructions about them that would be useful to put at the top of this issue, I think.


To be clear, I am all for better error messages and better tests asserting that we have and keep those good error messages. But we also have limited time, each PR requires time and effort to write and to review, and the question is where that effort is best spent.
IMO, it would be more useful to focus on "improve error messages" (and, while doing that, better test them) rather than on "fix bare pytest raises".

Also, if the above points are what make this exercise useful, then it is more concrete instructions about them that would be useful to put at the top of this issue, I think.

It might make sense to create a larger issue to track these (other issues worth including in such an issue are https://github.com/pandas-dev/pandas/issues/19159 and https://github.com/pandas-dev/pandas/issues/21575).

This part in itself is self-contained and is very approachable for beginners.

@gfyoung how are those issues you link related to this discussion?

They relate to the comment from @simonjayhawkins that you quoted.

#31072 contains the one missing match in stata.py

As I said here https://github.com/pandas-dev/pandas/pull/31091#issuecomment-575422207, I'm with @jorisvandenbossche's idea that we won't test error messages from external packages. Any ideas on how to mark those?

If we really don't want to test certain error messages (I could go either way on external ones to be fair), I think we should just create a helper function like this:

def external_error_raised(expected_exception):
    return pytest.raises(expected_exception, match=None)

This will make it clear to our future selves that this is a non-pandas error, and the match=None serves to appease any linting check we develop for bare pytest raises.

If we really don't want to test certain error messages (I could go either way on external ones to be fair), I think we should just create a helper function like this:

def external_error_raised(expected_exception):
    return pytest.raises(expected_exception, match=None)

This will make it clear to our future selves that this is a non-pandas error, and the match=None serves to appease any linting check we develop for bare pytest raises.

+1 on that.

I really like that idea; can we make it a convention for our tests?

That is, if a test checks that a function/method raises an error, and the error is an external error, we simply put match=None in the pytest.raises call.

can we make it a convention for our tests?

By that I mean putting a section about it in the Contributing guide.

That is, if a test checks that a function/method raises an error, and the error is an external error, we simply put match=None in the pytest.raises call.

I would prefer the helper function since you then wouldn't have to think about adding that. Also, the helper name is much clearer as to why we're doing it.

If we really don't want to test certain error messages (I could go either way on external ones to be fair), I think we should just create a helper function like this:

def external_error_raised(expected_exception):
    return pytest.raises(expected_exception, match=None)

This will make it clear to our future selves that this is a non-pandas error, and the match=None serves to appease any linting check we develop for bare pytest raises.

@gfyoung where do you recommend putting this helper function? (as if in what file?)

pandas._testing

Hello,

I would like to work on :

pandas/tests/arrays/interval/test_ops.py
pandas/tests/arrays/test_array.py
pandas/tests/arrays/test_boolean.py

Hello - I would like to work on:

pandas/tests/arithmetic/test_period.py
pandas/tests/arithmetic/test_timedelta64.py

Hello all, I'll take the following:

pandas/tests/computation/test_compat.py
pandas/tests/computation/test_eval.py
pandas/tests/dtypes/cast/test_upcast.py
pandas/tests/dtypes/test_dtypes.py

@MomIsBestFriend this one is done already but isn't marked as done:
pandas/tests/arithmetic/test_numeric.py

UPDATE

@MomIsBestFriend these too:
pandas/tests/arithmetic/test_period.py
pandas/tests/arrays/test_integer.py
pandas/tests/arrays/test_period.py

These are included in #31852

pandas/tests/extension/decimal/test_decimal.py
pandas/tests/extension/json/test_json.py
pandas/tests/extension/test_boolean.py
pandas/tests/extension/test_categorical.py
pandas/tests/frame/indexing/test_categorical.py
pandas/tests/frame/indexing/test_indexing.py
pandas/tests/frame/indexing/test_where.py
pandas/tests/frame/methods/test_explode.py
pandas/tests/frame/methods/test_isin.py
pandas/tests/frame/methods/test_quantile.py
pandas/tests/frame/methods/test_round.py
pandas/tests/frame/methods/test_sort_values.py
pandas/tests/frame/methods/test_to_dict.py

I'll take

pandas/tests/io/excel/test_readers.py
pandas/tests/io/excel/test_writers.py
pandas/tests/io/excel/test_xlwt.py
pandas/tests/io/formats/test_format.py
pandas/tests/io/formats/test_style.py
pandas/tests/io/formats/test_to_latex.py

@MomIsBestFriend
These are done without the mark:

  • pandas/tests/indexes/datetimes/test_astype.py

pandas/tests/indexes/datetimes/test_tools.py does not exist

I'll do:

  • pandas/tests/indexes/datetimes/test_constructors.py
  • pandas/tests/indexes/datetimes/test_date_range.py
  • pandas/tests/indexes/datetimes/test_indexing.py
  • pandas/tests/indexes/datetimes/test_shift.py
  • pandas/tests/indexes/datetimes/test_timezones.py

I have updated the original post. Now that there's a script to detect bare pytest raises, I have included instructions on how to use it; if anyone still has questions, you are more than welcome to ask :)

I'll take:

pandas/tests/arithmetic/test_timedelta64.py

pandas/tests/scalar/timestamp/test_arithmetic.py
pandas/tests/scalar/timestamp/test_comparisons.py
pandas/tests/scalar/timestamp/test_constructors.py
pandas/tests/scalar/timestamp/test_timezones.py
pandas/tests/scalar/timestamp/test_unary_ops.py

Seems all tests in pandas/tests/scalar/timestamp/ are already fixed:

$ git checkout master
Already on 'master'
$ python scripts/validate_unwanted_patterns.py -vt="bare_pytest_raises"  pandas/tests/scalar/timestamp/
$ 

pandas/tests/arrays/test_boolean.py => is missing.

I'm taking
pandas/tests/arrays/interval/test_ops.py
pandas/tests/arrays/test_datetimelike.py

pandas/tests/groupby/test_categorical.py
pandas/tests/groupby/test_groupby.py
pandas/tests/groupby/test_timegrouper.py

pandas/tests/arithmetic/test_timedelta64.py => #33010

pandas/tests/scalar/timestamp/test_arithmetic.py => no issue
pandas/tests/scalar/timestamp/test_comparisons.py => no issue
pandas/tests/scalar/timestamp/test_constructors.py => no issue
pandas/tests/scalar/timestamp/test_timezones.py => no issue
pandas/tests/scalar/timestamp/test_unary_ops.py => no issue

pandas/tests/arrays/test_boolean.py => is missing.

pandas/tests/arrays/interval/test_ops.py => #33010
pandas/tests/arrays/test_datetimelike.py => #33010

pandas/tests/groupby/test_categorical.py => #33144
pandas/tests/groupby/test_groupby.py => no issue
pandas/tests/groupby/test_timegrouper.py => no issue

pandas/tests/indexes/categorical/test_category.py => no issue
pandas/tests/indexes/common.py #33144
pandas/tests/indexes/datetimelike.py #33144

pandas/tests/indexes/interval/test_astype.py => all of the affected tests are marked as xfail; do we still need to fix them, and if so, how?

pandas/tests/indexes/multi/test_compat.py #33144
pandas/tests/indexes/multi/test_duplicates.py => no issue
pandas/tests/indexes/multi/test_format.py => file not found.
pandas/tests/indexes/multi/test_reshape.py #33144
pandas/tests/indexes/multi/test_setops.py => no issue
pandas/tests/indexes/multi/test_sorting.py #33144

@sumanau7 did you list the files that you are taking up? I'm working on some of the files that I see you have merged.

Done with

pandas/tests/indexes/categorical/test_category.py
pandas/tests/indexes/period/test_constructors.py
pandas/tests/indexes/period/test_join.py
pandas/tests/indexes/period/test_partial_slicing.py
pandas/tests/indexes/period/test_setops.py
pandas/tests/indexes/timedeltas/test_delete.py

I'm working with

pandas/tests/indexes/ranges/test_constructors.py
pandas/tests/indexes/ranges/test_range.py
pandas/tests/indexing/multiindex/test_chaining_and_caching.py
pandas/tests/indexing/multiindex/test_partial.py
pandas/tests/series/indexing/test_alter_index.py
pandas/tests/arrays/boolean/test_function.py

Working on:

pandas/tests/reshape/merge/test_multi.py

I'll take:
pandas/tests/window/moments/test_moments_ewm.py
pandas/tests/window/moments/test_moments_rolling.py
pandas/tests/window/test_dtypes.py
pandas/tests/window/test_ewm.py
pandas/tests/window/test_expanding.py
pandas/tests/window/test_timeseries_window.py

I'll also take:

  • pandas/tests/frame/methods/test_assign.py
  • pandas/tests/frame/methods/test_at_time.py
  • pandas/tests/frame/methods/test_between_time.py
  • pandas/tests/frame/methods/test_first_and_last.py
  • pandas/tests/frame/methods/test_interpolate.py
  • pandas/tests/frame/methods/test_replace.py
  • pandas/tests/frame/test_query_eval.py

Hi,
I'm a new developer to the project and I'd like to help with this. Which of the remaining tests are best for a beginner?

Thanks,
Kevin

Hi,
I'm a new developer to the project and I'd like to help with this. Which of the remaining tests are best for a beginner?

Thanks,
Kevin

Welcome! I don't think any of these are easier or harder than the others; any would be a good place to start.

I'm getting an error running the validate_unwanted_patterns.py script:

Traceback (most recent call last):
  File "C:\Users\Kevom\git\pandas\scripts\validate_unwanted_patterns.py", line 397, in <module>
    main(
  File "C:\Users\Kevom\git\pandas\scripts\validate_unwanted_patterns.py", line 352, in main
    for line_number, msg in function(file_obj):
  File "C:\Users\Kevom\git\pandas\scripts\validate_unwanted_patterns.py", line 88, in bare_pytest_raises
    contents = file_obj.read()
  File "C:\Program Files (x86)\Python\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 76843: character maps to <undefined>

It's occurring when reading pandas/tests/test_strings.py. The files are being opened with the default cp1252 encoding on Windows.

I wanted to run the script to double check which tests are still unfinished to avoid duplicating work.

Most of the NotImplementedErrors don't have a specific message to match. I know it kind of defeats the purpose, but is it a good idea to change them to pytest.raises(NotImplementedError, match=None) just to silence the linter?
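
For reference, a minimal sketch of the two ways discussed in this thread to express an intentionally message-less check (whether doing this for NotImplementedError is acceptable was left open; not_implemented is a stand-in function):

import pytest

import pandas._testing as tm

def not_implemented():
    raise NotImplementedError

def test_inline_match_none():
    # Explicit match=None signals that the bare raise is intentional.
    with pytest.raises(NotImplementedError, match=None):
        not_implemented()

def test_with_helper():
    # The helper from pandas._testing conveys the same intent by name.
    with tm.external_error_raised(NotImplementedError):
        not_implemented()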

I'll take:

  • pandas/tests/tools/test_to_datetime.py
  • pandas/tests/tseries/offsets/test_offsets.py

to start.

-Kevin

I am new to this, so I will start with:

  • pandas/tests/tseries/offsets/test_ticks.py

Hello, I am new to contributing. Thanks for the clear write-up in the issue. I'll start with taking pandas/tests/generic/test_duplicate_labels.py, and will tackle some more if it works out.

I will take pandas/tests/arrays/test_datetimelike.py as a start.
Also, if you cannot run python scripts/validate_unwanted_patterns.py -vt="bare_pytest_raises" pandas/tests/ successfully, try
python scripts/validate_unwanted_patterns.py -vt="bare_pytest_raises" pandas/tests/**/*.py instead.

The easiest way to run it now would be to add

    -   id: unwanted-patterns-bare-pytest-raises
        name: Check for use of bare pytest raises
        language: python
        entry: python scripts/validate_unwanted_patterns.py --validation-type="bare_pytest_raises"
        types: [python]
        files: ^pandas/tests/

to .pre-commit-config.yaml in the - repo: local section, and then run

pre-commit run unwanted-patterns-bare-pytest-raises --all-files.

I've updated the issue with the remaining outstanding files

I can take these:

  • [x] pandas/tests/io/pytables/test_timezones.py
  • [ ] pandas/tests/generic/methods/test_pipe.py
  • [ ] pandas/tests/reshape/merge/test_merge_asof.py
  • [ ] pandas/tests/extension/base/reduce.py
  • [ ] pandas/tests/extension/base/getitem.py
  • pandas/tests/arrays/test_datetimelike.py

These are the top 5 in the list as of today.

@marktgraham If you haven't done test_datetime.py yet, please leave it alone, as I am about to make a PR.

@liaoaoyuan97 no worries, I haven't touched test_datetimelike.py yet.

I will take pandas/tests/extension/base/getitem.py instead.

validate_unwanted_patterns.py raises an error on my side:

$ python scripts/validate_unwanted_patterns.py -vt="bare_pytest_raises" pandas/tests/
Traceback (most recent call last):
  File "scripts/validate_unwanted_patterns.py", line 479, in <module>
    output_format=args.format,
  File "scripts/validate_unwanted_patterns.py", line 435, in main
    with open(file_path, encoding="utf-8") as file_obj:
IsADirectoryError: [Errno 21] Is a directory: 'pandas/tests/'

Seems to be related to #37419 perhaps?

I tried the approach proposed by @MarcoGorelli and it worked perfectly.

The easiest way to run it now would be to add

    -   id: unwanted-patterns-bare-pytest-raises
        name: Check for use of bare pytest raises
        language: python
        entry: python scripts/validate_unwanted_patterns.py --validation-type="bare_pytest_raises"
        types: [python]
        files: ^pandas/tests/

to .pre-commit-config.yaml in the - repo: local section, and then run

pre-commit run unwanted-patterns-bare-pytest-raises --all-files.

Does it make sense to add this to the .pre-commit-config.yaml and then update the instructions on this thread?

Does it make sense to add this to the .pre-commit-config.yaml and then update the instructions on this thread?

We will add it to .pre-commit-config.yaml once all the errors it raises are fixed, yes

Seems to be related to #37419 perhaps?

No, it's related to #37379 (which is when we moved this script over to pre-commit, hence it was no longer necessary for it to run on directories)
