Currently Perl is extended with Moose, Mouse, Moo, and Mo, and the like. These modules all have one commonality -- their attribute selectors are established with has
. I believe we can make the assumption that anything that follows the Moose-established convention of,
has $name|@$names => %options
is a properly formed attribute for one of the many object build systems out there.
We'd have to cover things like
has "foo" => ()
has ("foo" => ())
has [qw/foo bar/] => ()
has ([qw/foo bar/] => ())
@dtikhonov, could you look at this?
Moose et al. are just modules and has
is a function call, not a Perl keyword. Including support for Moose into the ctags Perl parser would place us on a slippery slope, as there are many Perl modules that introduce function calls that look like keywords.
...I'm very interested in this topic. Past 3 years, I have improved ctags to handle DSLs.
See http://docs.ctags.io/en/latest/running-multi-parsers.html?highlight=subparser#tagging-definitions-of-higher-upper-level-language-sub-base about my vision.
Moose subparser can run on Perl parser.
package Point;
use Moose; # automatically turns on strict and warnings
has 'x' => (is => 'rw', isa => 'Int');
has 'y' => (is => 'rw', isa => 'Int');
sub clear {
my $self = shift;
$self->x(0);
$self->y(0);
}
package Point3D;
use Moose;
extends 'Point';
has 'z' => (is => 'rw', isa => 'Int');
after 'clear' => sub {
my $self = shift;
$self->z(0);
};
When the Perl parser finds a keyword "use", Perl parser notifies it to registered subparsers.
Moose subparser is such a subparser.
The Moose subparser activates itself if the notified module name is "Moose".
The Moose subparser scans ctags internal symbol table named cork to find a Perl language object
marked "package" kind by the Perl parser. The Moose subparser may find "Point" in the symbol table.
It is already tagged by the Perl parser. However, the Moose parser makes a new tag for "Point" with "class" kind of "Moose" language.
The Moose subparser pushes the "Point" to scope stack.
The Moose subparser finds "has" as a keyword. The Moose subparser makes a tag for "x" found as the next token to "has". The making the tag for "x", "scope:class:Point language:Moose" is attached to "x"...
About C++ parser, I have already implemented this idea.
QtMoc subparser is for parsing C++ code using Qt. QtMoc subparser recognizes signal and slot, which are introduced as a dialect of C++ language.
To implement a subparser, I have to get permission from the maintainer of its base parser.
I guess @dtikhonov is not so positive because introducing a subparser makes the code of base parser complicated.
If there is confliction between a parser maintainer and me, who wants to introduce a subparser to the parser, I will withdraw from implementing the subparser because having a parser maintainer is the most important thing in the u-ctags project.
@dtikhonov, if you are interested in the idea of implementing Moose as a subparser, let me know.
I will make a prototype to show the impact on the source code of the Perl parser.
Thank you for reading.
I realize it's a function, under the hood. But, that's an implementation detail. In the same sense, packages are "classes" when you're programming in object oriented fashion, and "subs" are methods. All of that nomenclature changes when you use Object Oriented perl, but in each case you could make use of the non-OO terminology. In all the above cases though there are added complexities in inferring which paradigm the user has written the code for. That said, "has" lacks those complexities. If you see it you have two possible interpretations (a) you're calling a function "has" in which case the desired behavior would be to not do anything, as we do now; or, (b) you're using a Moose-esque attribute declaration.
The "Moose et al" quote from your response shows how strong that convention is in Perl to use these module-builder frameworks (which embrace has
-attribute declaration). Moose is easily one of the most popular modules for Perl (it has more stars on metacpan than DBI and LWP).
I am not saying we could not have any false-positives, but I don't know of anyone who has published a module that does not follow this convention and it stands to help a lot of people. Moreover, I truly doubt that this ctags is theoretically sound rather than just practical.
@masatake the subparser idea sounds better and more complete than my implementation, and perl actually already does something similar by testing against use
and invoking different semantics for some of those modules internally.
https://github.com/universal-ctags/ctags/blob/master/parsers/perl.c#L348
I guess the problem there is that you'd have to enumerate the list of modules that trigger the Moose-parser which wouldn't be too bad. You'd also have to assume that no one is running a subclass of Moose which be tricky, but if they were doing that the worse case scenario is they'd lose the Moose-subparser. And, I'm OK with that -- it wouldn't be any worse than what they have that.
If you want help writing a C parser for Moose, I can help out with that too.
I'm almost of the mind set that this kind of static-parser for Perl and Moose would be useful outside of ctags too. Currently parsing perl is insecure, and there are no guarantees. This would allow other projects to securely run test coverage on Perl, for example.
@masatake, Perl 5 is such a slow-moving target that there is not much to maintain, so my job as the maintainer of perl.c
has been easy. I scanned through the subparser documentation you referenced -- all I can say is: wow! To be frank, I haven't been following Universal Ctags development closely and I see you've made great strides.
@EvanCarroll, code talks: feel free to take a shot at it. I am not married to perl.c
, either. Parts of it are outright ugly and, in hindsight, some things should have been implemented differently. With regard to the use of "use": touche, point taken.
Cool, cool. I'll see what I can get done this weekend then. I'll check out the work on the subparser framework too.
A prototype works.
@EvanCarroll, you can start from #2070.
I'm focusing on TypeScript parser. So I will not work on #2070 for awhile.
Hack as you like.
[yamato@slave]~/var/ctags/perl-moose/Units/simple-moose.d% cat input.pl
cat input.pl
# Taken from https://metacpan.org/pod/Moose
package Point;
use Moose; # automatically turns on strict and warnings
has 'x' => (is => 'rw', isa => 'Int');
has 'y' => (is => 'rw', isa => 'Int');
sub clear {
my $self = shift;
$self->x(0);
$self->y(0);
}
package Point3D;
use Moose;
extends 'Point';
has 'z' => (is => 'rw', isa => 'Int');
after 'clear' => sub {
my $self = shift;
$self->z(0);
};
no Moose;
package Line;
[yamato@slave]~/var/ctags/perl-moose/Units/simple-moose.d% ./ctags --fields=+lineK --sort=no -o - --extras=+s ./input.pl
./ctags --fields=+lineK --sort=no -o - --extras=+s ./input.pl
zsh: no such file or directory: ./ctags
[yamato@slave]~/var/ctags/perl-moose/Units/simple-moose.d% ../../ctags --fields=+lineK --sort=no -o - --extras=+s ./input.pl
../../ctags --fields=+lineK --sort=no -o - --extras=+s ./input.pl
Point ./input.pl /^package Point;$/;" package line:2 language:Perl
Point ./input.pl /^package Point;$/;" class line:2 language:Moose
x ./input.pl /^has 'x' => (is => 'rw', isa => 'Int');$/;" attribute line:5 language:Moose class:Point
y ./input.pl /^has 'y' => (is => 'rw', isa => 'Int');$/;" attribute line:6 language:Moose class:Point
clear ./input.pl /^sub clear {$/;" subroutine line:8 language:Perl
clear ./input.pl /^sub clear {$/;" method line:8 language:Moose class:Point
Point3D ./input.pl /^package Point3D;$/;" package line:14 language:Perl
Point3D ./input.pl /^package Point3D;$/;" class line:14 language:Moose inherits:Point end:26
z ./input.pl /^has 'z' => (is => 'rw', isa => 'Int');$/;" attribute line:19 language:Moose class:Point3D
clear ./input.pl /^after 'clear' => sub {$/;" attribute line:21 language:Moose class:Point3D
Line ./input.pl /^package Line;$/;" package line:27 language:Perl
[yamato@slave]~/var/ctags/perl-moose/Units/simple-moose.d% ./ctags --list-subparsers
./ctags --list-subparsers
zsh: no such file or directory: ./ctags
[yamato@slave]~/var/ctags/perl-moose/Units/simple-moose.d% ../../ctags --list-subparsers
../../ctags --list-subparsers
#NAME BASEPARSER DIRECTIONS
AnsiblePlaybook Yaml base <> sub {bidirectional}
Autoconf M4 base <> sub {bidirectional}
Automake Make base <= sub {dedicated}
ITcl Tcl base <> sub {bidirectional}
JAXRS Java base => sub {shared}
LinuxKernel C base => sub {shared}
LinuxOVS C base => sub {shared}
Moose Perl base <> sub {bidirectional}
OpenVSwitch C base => sub {shared}
PythonLoggingConfig Iniconf base <> sub {bidirectional}
PythonMain Python base => sub {shared}
PythonTuned Python base => sub {shared}
QtMoc C++ base <> sub {bidirectional}
RSpec Ruby base => sub {shared}
SystemdUnit Iniconf base <= sub {dedicated}
TclOO Tcl base <> sub {bidirectional}
YumRepo Iniconf base <= sub {dedicated}
selinuxPolicy selinuxPolicyBase base => sub {shared}
servlet Java base => sub {shared}
[yamato@slave]~/var/ctags/perl-moose% ./ctags --list-kinds-full=Moose
#LETTER NAME ENABLED REFONLY NROLES MASTER DESCRIPTION
a attribute yes no 0 NONE attributes
c class yes no 0 NONE classes
m method yes no 0 NONE methods
A prototype works. @EvanCarroll, you can start from #2070.
That was fast! 👍
@masatake impressive. Is there any more work to do? I see you've implemented findAttribute
and the language callback reference?
@masatake impressive. Is there any more work to do?
That is the I would like to know :-).
If I missed what you wanted, please, let me know.
I read only a few parts of the document of Moose.
@masatake cool deal, I reviewed the patch with github. If you can't get around to it, I'll take a look this weekend. =) But you'll probably have it done in 0.0001 seconds.
But you'll probably have it done in 0.0001 seconds.
It is not true.
To improve the parser, we need 4 things: knowledge about ctags internal API, knowledge about the target language, Moose, time, and passion:-P.
I don't have knowledge about Moose. This one is the most important.
e.g. the current parser doesn't capture a rule attached to a class with with
, and defined with "use Moose::Rule". What I know is only "rule". However, you may know more items ctags should capture.
Don't you expect ctags capture such rules?
If a rule is attached to a class, I think the tag for the class should have a field named "rule:
These requests from a person who knows the target language are needed to grow this project.
I can implement some of them. You can implement other ones. We can implement later if such requests are.
We can extend the current implementation to support Mouse, Moo, and Mo. We can do anything you
want WITHOUT modifying the original Perl parser.
Sorry, I found you gave me important comments in #2070 already. Thank you.