Latex3: define \tl_if_integer:n

Created on 5 Jul 2021  ·  19 Comments  ·  Source: latex3/latex3

Hi,

It would be nice to be able to test whether a token list is an integer, without resorting to the internal \__int_to_roman:w:

\prg_new_conditional:Npnn \tl_if_integer:n #1 { p, T, F, TF }
{
  \tl_if_blank:oTF { \__int_to_roman:w -0#1 }
   {
    \prg_return_true:
   }
   {
    \prg_return_false:
   }
}

All 19 comments

Better here:

\prg_new_conditional:Npnn \tl_if_integer:n #1 { p, T, F, TF }
{
  \tl_if_blank:nTF{#1}
    {
      \prg_return_false:
    }
    {
      \tl_if_blank:oTF { \__int_to_roman:w -0#1 }
      {
        \prg_return_true:
      }
      {
        \prg_return_false:
      }
    }
}

The name seems odd: It doesn't actually check for an integer (e.g. -5 is an integer but returns false, +7 returns false too). It seems more like a test if only digits are present.

@zauguin OK for the name, but the function is useful, and it matches the old trick from the TeXbook for checking whether something is made of digits. I agree that an integer test could be implemented using regex, but it would be slow.

Improved with your comments:

\prg_new_conditional:Npnn \tl_if_digit:n #1 { p, T, F, TF }
{
  \tl_if_blank:nTF{#1}
    {
      \prg_return_false:
    }
    {
      \tl_if_blank:oTF { \__int_to_roman:w -0#1 }
      {
        \prg_return_true:
      }
      {
        \prg_return_false:
      }
    }
  }
\prg_generate_conditional_variant:Nnn \tl_if_digit:n { o } { p, T, F, TF }
\prg_new_conditional:Npnn \__tl_if_integer:n #1 { p, T, F, TF }
{
    \exp_args:No\str_if_eq:onTF{\tl_head:n{#1}}{+}
    {
      \exp_args:No\tl_if_digit:oTF{\tl_tail:n{#1}}
      {
        \prg_return_true:
      }
      {
        \prg_return_false:
      }
    }
    {
      \exp_args:No\str_if_eq:onTF{\tl_head:n{#1}}{-}
      {
        \exp_args:No\tl_if_digit:oTF{\tl_tail:n{#1}}
        {
          \prg_return_true:
        }
        {
          \prg_return_false:
        }
      }
      {
        \prg_return_false:
      }
    }
}
\prg_new_conditional:Npnn \tl_if_integer:n #1 { p, T, F, TF }
{
  % fast path
  \tl_if_digit:nTF{#1}
  {
    \prg_return_true:
  }
  {
    % slow path
    \__tl_if_integer:nTF{#1}
    {
      \prg_return_true:
    }
    {
      \prg_return_false:
    }
  }
}

@bastien-roucaries TeX accepts multiple signs in front of a number, so your test is usually enough but not always. Should this be implemented, it could allow multiple signs like this:

\ExplSyntaxOn
\prg_new_conditional:Npnn \str_if_integer:n #1 { p, T, F, TF }
  { \exp_after:wN \__str_if_integer_sign:N \tl_to_str:n {#1} \scan_stop: }
\cs_new:Npn \__str_if_integer_sign:N #1
  {
    \if:w $
        \if_meaning:w - #1 F \fi:
        \if_meaning:w + #1 F \fi: $
      \exp_after:wN \__str_if_integer_digits:w \exp_after:wN #1
    \else:
      \exp_after:wN \__str_if_integer_sign:N
    \fi:
  }
\cs_new:Npn \__str_if_integer_digits:w #1 \scan_stop:
  {
    \tl_if_blank:nTF {#1}
      { \prg_return_false: }
      {
        \tl_if_blank:oTF { \__int_to_roman:w -0#1 }
          { \prg_return_true: }
          { \prg_return_false: }
      }
  }

\cs_new:Npn \test #1
  { \typeout { #1: \str_if_integer:nTF {#1} { INT } { NOT } } }

\test {   }
\test { ~ }
\test { 1 }
\test { - }
\test { -1 }
\test { +1 }
\test { +-+-+-1 }

\stop

The test is a str one to prevent expansion of tokens in the argument. Neither my implementation nor yours expands the signs, but both expand the digits with \romannumeral. A tl version would probably have to expand everything as needed.

I agree it would make sense to provide \str_if_integer:nTF, but it is not yet clear what the allowed integers should be. Presumably spaces should be allowed between signs, but not within an integer since TeX disallows it?

Code based on \romannumeral breaks for long strings of digits, with a "number too big" error. Presumably we need a slower process that only grabs a few digits at a time, but then it gets annoying to detect spaces between digits.

An alternative/addition would be \str_to_integer:nF that sanitizes the string if it looks "enough" like an integer (we could decide to be somewhat lax (but well documented)) and otherwise leaves its second argument in the input stream.

@blefloch What is the documented limit of \romannumeral ?

@bastien-roucaries The biggest integer TeX can handle: 2³¹−1 = 2147483647.

@PhelypeOleinik It seems that @blefloch means that +-+-+-+-+-+-+-+-+-+-+-+-+000000000000000000000000000000000000000000000000000000000000000 could break \romannumeral...

So what is the max string size ?

@bastien-roucaries \romannumeral only breaks with “Number too big” for a nonzero number. TeX can take an arbitrarily long string of zeros as long as the resulting integer is within the [−(2³¹−1), 2³¹−1] range.
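To make the range concrete, here is a small sketch using only documented expl3 functions (\int_eval:n, \int_use:N and the constant \c_max_int). It assumes a recent LaTeX format in which expl3 is preloaded; the file itself is just an illustration, not something posted in the thread.

```latex
\documentclass{article}
\begin{document}
\ExplSyntaxOn
% Leading zeros do not count towards the size of the number.
\typeout { \int_eval:n { 0000000000000000000000000000002147483647 } }
% \c_max_int holds the same upper bound, 2^31 - 1.
\typeout { \int_use:N \c_max_int }
% The lower bound is simply the negation of that value.
\typeout { \int_eval:n { - \c_max_int } }
% Uncommenting the next line should trigger "Number too big":
% \typeout { \int_eval:n { 2147483648 } }
\ExplSyntaxOff
\end{document}
```

So a test built on \romannumeral or \numexpr only needs to worry about the value of the number, not about how many characters were used to write it.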

@bastien-roucaries @blefloch I'd not forget integers in hexadecimal or octal format. And

-`a

is an integer too.
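For illustration, the following sketch shows the three alternative notations being accepted by \int_eval:n. It assumes a recent LaTeX format with expl3 preloaded and that the characters ", ' and ` have their default category codes (for example, no babel shorthands are active).

```latex
\documentclass{article}
\begin{document}
\ExplSyntaxOn
\typeout { \int_eval:n { "FF } } % hexadecimal constant, 255
\typeout { \int_eval:n { '17 } } % octal constant, 15
\typeout { \int_eval:n { -`a } } % negated character code of "a", -97
\ExplSyntaxOff
\end{document}
```

A test that only looks for an optional sign followed by decimal digits would reject all three, even though TeX reads them as valid integers.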

There is an internal test for digits in l3bitset, \__bitset_test_digits:nTF. It doesn't handle signs, as this was not needed, but it should handle integer expressions.

As @eg9 just hinted, what is the actual use case here? Is it a user-level document check for a string of digits, or a check that the string is a valid input to an expl3 integer function? The latter seems more useful in an expl3 context, and basically means that nothing is left after \numexpr#1, so the slow check for literal strings of digits would not be needed.

@davidcarlisle This issue is about the first case, but I would also like to have the second.

I don't think that I'm in favor of extending the L3 programming layer with random extensions that are (possibly) useful in special situations but do not have clear, regularly needed use cases. To me this is one such example. If we start with that, where do we end? How many special "is this tl some X" tests should we support then? My vote on this is no; not for the core language.

@FrankMittelbach Knowing whether or not we can do some computation on a tl is useful. The other test could be done using regex, but with the pitfall that it does not work in an expansion context (and I need this test to work in an expansion context).

No doubt. But is it useful enough to warrant its inclusion in the core language (rather than being provided as part of some package code when actually needed)? That's the question for me here, and so far I haven't really seen arguments that favor its inclusion.

You can ask the same question for "is it a dimension", "is it a skip", "is it just letters", "does it contain non-ASCII characters", "is it a date format", and, and, and ... the possibilities are quite open-ended, and for most of them you can construct a use case or two. But 99% of the time they will just sit there and take up space. And while space is not at a premium as it was in the past, it still adds to the complexity and maintenance burden of the core system. So again, is this functionality in wide and repeated use? If yes, and I hear convincing arguments for that, it could be considered a candidate for inclusion. If not, it should be implemented as part of the code that needs it.

@FrankMittelbach The problem is specific to biblatex here. We need to test whether some field is an integer (think of page numbers) and then do some arithmetic on it, in an expandable context.
OK, it is specific, but the tricky part is the expandable stuff.
Maybe this kind of test should only be documented. Reading the doc, I do not know how to do it, and moreover the biblatex core assumes it is not possible to test (in an expansion-compatible way) whether a token list is an integer.

Maybe the solution is to make regex expandable (I think that would be nice).

Don't get me wrong, I'm not at all against helping you on the biblatex side to make this work, and it may well be that the core is missing some functionality that should be there to support that. My point is that the core should restrict itself to needs that are "general" and provide the basic general foundation for writing special code, but not provide all kinds of extensions that are used very seldom, if at all (e.g., in this case, when biblatex is not used). Otherwise we will eventually end up with a very bloated set of commands.

Making things work expandably usually comes with a heavy price tag, either in limited functionality or loss of speed or both. So I doubt it is a good idea to try to make regex expandable, but I'll let @blefloch comment on that.

But without knowing your use case in more detail: somewhere your fields are being set up, and that is not done expandably as it will require assignments, so at that stage you could determine whether you have an integer or something else and record that fact, which could then be used expandably (not sure that helps or is feasible, but it would avoid testing the field over and over again).
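A minimal sketch of that "classify once, use expandably later" idea could look as follows. Everything in the demo module (\demo_set_pages:n, \demo_next_page:, the two variables) is hypothetical and invented for illustration; it is not biblatex's actual interface. The regex accepts only an optional sign followed by one to nine decimal digits, which is an assumption chosen here so that the later arithmetic cannot overflow.

```latex
\documentclass{article}
\begin{document}
\ExplSyntaxOn
% Hypothetical storage for one field and for its classification.
\tl_new:N   \l_demo_pages_tl
\bool_new:N \l_demo_pages_is_int_bool
% Setting the field needs assignments anyway, so the (non-expandable)
% regex test can run here, once, and its result is recorded.
\cs_new_protected:Npn \demo_set_pages:n #1
  {
    \tl_set:Nn \l_demo_pages_tl {#1}
    \regex_match:nnTF { \A [\+\-]? \d{1,9} \Z } {#1}
      { \bool_set_true:N  \l_demo_pages_is_int_bool }
      { \bool_set_false:N \l_demo_pages_is_int_bool }
  }
% Expandable consumer: it only reads the recorded boolean.
\cs_new:Npn \demo_next_page:
  {
    \bool_if:NTF \l_demo_pages_is_int_bool
      { \int_eval:n { \l_demo_pages_tl + 1 } }
      { \l_demo_pages_tl }
  }
\demo_set_pages:n { 41 }
\typeout { \demo_next_page: } % writes 42
\demo_set_pages:n { xii }
\typeout { \demo_next_page: } % writes xii unchanged
\ExplSyntaxOff
\end{document}
```

The trade-off is that the classification has to be refreshed whenever the field changes, but the expensive test is kept out of the expansion-only code path.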
