Mixing slices and individual labels to select along the same axis #429

Open
alixdamman opened this issue Sep 18, 2017 · 12 comments

@alixdamman
Collaborator

alixdamman commented Sep 18, 2017

The idea is to be able to select a subset as follows:

>>> arr = ndtest(6)
>>> arr
a  a0  a1  a2  a3  a4  a5
    0   1   2   3   4   5
>>> arr["a0:a2,a4"]
a  a0  a1  a2  a4
    0   1   2   4

Is there a good reason why this doesn't work yet?

The current workaround is:

>>> arr[X.a["a0:a2"].union(X.a["a4"])]
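
For illustration only, the requested selection can be sketched in plain Python, independent of LArray. The helper `expand_key` and its parsing rules are assumptions made for this sketch ("," separates parts, and "start:stop" is a label slice that includes both ends); it is not how LArray parses keys.

```python
def expand_key(labels, key):
    """Expand a key such as 'a0:a2,a4' into a flat list of labels.

    Hypothetical helper: label slices include both bounds, and steps
    (e.g. 'a0:a4:2') are not supported.
    """
    selected = []
    for part in key.split(','):
        part = part.strip()
        if ':' in part:
            start, stop = (p.strip() for p in part.split(':'))
            # An empty bound means "from the first" / "to the last" label.
            first = labels.index(start) if start else 0
            last = labels.index(stop) if stop else len(labels) - 1
            selected.extend(labels[first:last + 1])
        else:
            if part not in labels:
                raise KeyError(part)
            selected.append(part)
    return selected


labels = ['a0', 'a1', 'a2', 'a3', 'a4', 'a5']
data = [0, 1, 2, 3, 4, 5]

picked = expand_key(labels, 'a0:a2,a4')
values = [data[labels.index(label)] for label in picked]
print(picked)
print(values)
```
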
@gdementen
Contributor

gdementen commented Sep 18, 2017

This is basically a duplicate of #360. There is a reason, but I have no time to explain it now.

@alixdamman
Collaborator Author

Is that reason technical or philosophical? (This issue comes from a request by one of our current users.)

@gdementen
Contributor

I would say neither. It has many syntax implications for other features, and I never had both the time AND a clear enough mind to think it through. It is related to #23 and the whole constellation of related issues, and to how we make our API future-proof.

>>> a = Axis('a=a0..a4')
>>> # when using the string syntax, it seems natural that:
>>> a['a0:a2,a4']
a[['a0', 'a1', 'a2', 'a4']]
>>> # and
>>> a['a0:a2;a4']
(a['a0':'a2'], a['a4'])
>>> # BUT I want to have a way to do it easily using the non string syntax AND keep both syntaxes consistent
>>> a['a0':'a2', 'a4']
a[['a0', 'a1', 'a2', 'a4']]
>>> # OR?
(a['a0':'a2'], a['a4'])

I have been leaning toward the second option, because we can easily add something to go from multiple groups to a single group, but not the opposite. Given the inconsistency with the string syntax, though, this led to a deadlock.

The question is thus: is there an alternative that makes it easy to create several groups containing slices on the same axis?

FWIW, for non-slices, this is a non-issue:

>>> a[['a0', 'a1', 'a2'], ['a4']]
(a['a0', 'a1', 'a2'], a['a4'])
>>> a[['a0', 'a1', 'a2', 'a4']]
a[['a0', 'a1', 'a2', 'a4']]
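
To make the two readings concrete, here is a plain-Python sketch. Python hands `a['a0':'a2', 'a4']` to `__getitem__` as the tuple `(slice('a0', 'a2'), 'a4')`, so the choice is only about how that tuple is interpreted. The function names `resolve`, `as_single_group` and `as_several_groups` are made up for this sketch and do not exist in LArray; label slices are assumed to include both bounds.

```python
def resolve(labels, item):
    """Turn one key item (a slice or a scalar label) into a list of labels."""
    if isinstance(item, slice):
        first = 0 if item.start is None else labels.index(item.start)
        last = len(labels) - 1 if item.stop is None else labels.index(item.stop)
        return labels[first:last + 1]
    return [item]


def as_single_group(labels, key):
    """Reading 1: ',' concatenates everything into one group."""
    return [label for item in key for label in resolve(labels, item)]


def as_several_groups(labels, key):
    """Reading 2: ',' separates distinct groups."""
    return tuple(resolve(labels, item) for item in key)


labels = ['a0', 'a1', 'a2', 'a3', 'a4']
# This tuple is what a['a0':'a2', 'a4'] passes to __getitem__.
key = (slice('a0', 'a2'), 'a4')

single = as_single_group(labels, key)
several = as_several_groups(labels, key)
print(single)
print(several)

# Going from several groups to a single one is a simple flattening step;
# the reverse is not possible because the group boundaries are lost.
merged = [label for group in several for label in group]
print(merged == single)
```
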

@gdementen
Contributor

One alternative would be to use a "slice builder" (e.g. S) and move subgroups into lists, but this feels too hard/unnatural for my taste:

>>> a[[S['a0':'a2']], ['a4']]
(a['a0':'a2'], a['a4'])
>>> a['a0':'a2', 'a4']
a[['a0', 'a1', 'a2', 'a4']]
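
A slice builder needs very little machinery: an object whose `__getitem__` returns its key unchanged is enough, which is also how NumPy's `numpy.s_` helper behaves. The sketch below is plain Python; `SliceBuilder` and `KeyRecorder` are illustrative names, not existing LArray objects, and `a` here only records the key it receives.

```python
class SliceBuilder:
    """Hypothetical helper: S[start:stop] evaluates to slice(start, stop)."""

    def __getitem__(self, key):
        return key


class KeyRecorder:
    """Stand-in for an axis: returns whatever key Python passes in."""

    def __getitem__(self, key):
        return key


S = SliceBuilder()
a = KeyRecorder()

# Subgroups wrapped in lists: the key is a tuple of two lists.
nested = a[[S['a0':'a2']], ['a4']]
# No wrapping: the key is a tuple holding a slice and a scalar.
flat = a['a0':'a2', 'a4']

print(nested)  # ([slice('a0', 'a2', None)], ['a4'])
print(flat)    # (slice('a0', 'a2', None), 'a4')
```

The two keys are distinguishable by type alone, so an axis could treat lists as separate groups and bare items as parts of one group.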

@gdementen
Contributor

Assuming we keep "," with scalars meaning "concatenate", to match the string syntax:

>>> a['a0':'a2', 'a4']
a[['a0', 'a1', 'a2', 'a4']]

We have a few options:

>>> # in the worst case, we can use the following syntax. We will get it "for free" because:
>>> # * we need to implement G[] anyway to easily create groups without an axis,
>>> # * we need to keep `axis[several, group]` returning `(axis[several], axis[group])`
>>> a[G['a0':'a2'], G['a4']]
(a['a0':'a2'], a['a4'])
>>> # but we could implement this too:
>>> a.groups['a0':'a2', 'a4']
(a['a0':'a2'], a['a4'])
>>> # unsure what we should return in that case though (maybe raise an exception?):
>>> a[G['a0':'a2'], 'a4']

FWIW: here is one of my use cases in MOSES:

>>> # current code
>>> clength_groups = (x.clength[1:15], x.clength[16:25], x.clength[26:30], x.clength[31:35], x.clength[36:40], x.clength[41:50])
>>> # potential improvements
>>> clength_groups = x.clength[G[1:15], G[16:25], G[26:30], G[31:35], G[36:40], G[41:50]]
>>> clength_groups = x.clength.groups[1:15, 16:25, 26:30, 31:35, 36:40, 41:50]
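
The options above can be prototyped in plain Python to see how they would coexist. Everything in this sketch is hypothetical: `ToyAxis`, `Group`, `G` and the `groups` indexer are stand-ins written for illustration, not LArray classes, and groups are represented as plain lists of labels. Raising `TypeError` for a key that mixes `G[...]` with bare labels is just one possible answer to the open question above.

```python
class Group:
    """Toy axis-less group: only remembers its key."""

    def __init__(self, key):
        self.key = key


class _GroupBuilder:
    def __getitem__(self, key):
        return Group(key)


G = _GroupBuilder()


class _GroupsIndexer:
    """axis.groups[k1, k2, ...] -> one group per item."""

    def __init__(self, axis):
        self.axis = axis

    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        return tuple(self.axis.labels_for(item) for item in key)


class ToyAxis:
    def __init__(self, name, labels):
        self.name = name
        self.labels = list(labels)
        self.groups = _GroupsIndexer(self)

    def labels_for(self, item):
        """Resolve a Group, slice or scalar label (slices include both bounds)."""
        if isinstance(item, Group):
            item = item.key
        if isinstance(item, slice):
            first = 0 if item.start is None else self.labels.index(item.start)
            last = (len(self.labels) - 1 if item.stop is None
                    else self.labels.index(item.stop))
            return self.labels[first:last + 1]
        return [item]

    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        is_group = [isinstance(item, Group) for item in key]
        if all(is_group):
            # axis[G[...], G[...]] -> several groups
            return tuple(self.labels_for(item) for item in key)
        if any(is_group):
            # Assumption of this sketch: mixing is rejected.
            raise TypeError("cannot mix G[...] with bare labels or slices")
        # axis[slice, label] -> "," concatenates into a single group
        return [label for item in key for label in self.labels_for(item)]


a = ToyAxis('a', ['a0', 'a1', 'a2', 'a3', 'a4'])
print(a['a0':'a2', 'a4'])
print(a[G['a0':'a2'], G['a4']])
print(a.groups['a0':'a2', 'a4'])

clength = ToyAxis('clength', range(1, 51))
clength_groups = clength.groups[1:15, 16:25, 26:30, 31:35, 36:40, 41:50]
print([len(group) for group in clength_groups])
```
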

@gdementen
Contributor

So basically, we can start implementing this.

@alixdamman
Collaborator Author

For release 0.27?

@gdementen
Contributor

If you feel like it. My comment just meant "it is no longer blocked by my hesitation about the indirect consequences of this".

@gdementen
Contributor

I ran into this need in planet.

@alixdamman alixdamman added this to the 0.29 milestone Nov 28, 2017
@alixdamman alixdamman modified the milestones: 0.29, 0.30 Feb 18, 2018
@alixdamman alixdamman modified the milestones: 0.30, 0.31 Jul 18, 2018
@alixdamman alixdamman modified the milestones: 0.31, 0.32 Feb 6, 2019
@alixdamman alixdamman removed this from the 0.32 milestone Aug 1, 2019
@gdementen
Contributor

gdementen commented Feb 18, 2021

With the current LArray version, the easiest workaround is to use "union": arr[a["a0:a2"].union(a["a4"])]

@alixdamman
Collaborator Author

As I understand it, Axis.groups[] returns a tuple of groups:

>>> a = Axis('a=a0..a4')
>>> a.groups['a0':'a2', 'a4']
(a['a0':'a2'], a['a4'])

BUT that would generate anonymous groups, leading to ugly automatic labels when performing an aggregation.

Furthermore, it doesn't solve the problem when the user actually wants to create a single group whose labels are given by mixing a slice and a list of individual labels.

This is my understanding of Axis.groups[]; I may have missed something.

@gdementen
Contributor

Axis.groups[] does not exist yet AFAIK; it was just an idea to support the use case where users want to create several groups on the same axis "quickly" if we make Axis[slice, slice] return a single group.
