Skip to content

Don't sort by sentiment in sentiment_by#13

Open
vspinu wants to merge 1 commit into
trinker:masterfrom
vspinu:remove_sort
Open

Don't sort by sentiment in sentiment_by#13
vspinu wants to merge 1 commit into
trinker:masterfrom
vspinu:remove_sort

Conversation

@vspinu

@vspinu vspinu commented Apr 13, 2016

Copy link
Copy Markdown
Contributor

Usually grouping operators preserve the order of the groups. In this sense the behavior of sentiment_by is non-intuitive. Especially with data.table by would better have the same semantics as in data.table.

While sorting by sentiment is potentially useful in some contexts, there is virtually no advantage for it with current plotting functionality. Plotting by grouping values is probably more meaningful in majority of the cases.

If they so desire, users can always sort by sentiment with a simple setkey. Sorting by keys is more verbose and results in unnecessary groups names replication.

For these reasons I would propose to remove this "feature".

@trinker

trinker commented May 7, 2016

Copy link
Copy Markdown
Owner

Thanks for the PR and interest in sentimentr...

Still debating on this. I use sentimentr for analysis and usually care most about the order of the output. Just like with plots, tables shine when they are ordered by the variable of interest. I think you have a specific reason of why it's screwing something up for you but am unsure what that is. I would appreciate a stronger argument for changing the behavior. I'd be more inclined to add this as an argument but retain the default behavior.

@vspinu

vspinu commented May 8, 2016

Copy link
Copy Markdown
Contributor Author

I don't have a "stronger" argument than the one I have already said. If I do processing by a factor variable G, then I expect my output to be ordered by original order of data or by the order of the grouped factor. That's what all group_by processor do in R. The fact that the output is sorted by something else is unexpected unexpected. Groups are almost certainly have a meaningful order so, I think, preserving that order is desirable independently of the context.

In my case the order is given by time. I have thousands sentences which are grouped by messages which in turn are ordered in time. Sorting it by sentiment doesn't make much sense AFAIC.

I'd be more inclined to add this as an argument but retain the default behavior.

It's up to you to decide of course, but I think sorting is an 'extra" feature. It adds up to the default computation, so it should an extra optional option. Otherwise users would need to revert that sorting computation.

@trinker

trinker commented Sep 14, 2016

Copy link
Copy Markdown
Owner

I have been bitten by this twice this week. You're right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants