New lesson improvement #173
@pitviper6 thanks for all your work on these issues! Happy to merge.
Overall this is really good - I love the split of the transformations episode into the finer grained episodes - it works really well.
I've made a number of comments - I've been a bit picky (sorry) - but overall I want to say this is brilliant. Thanks for putting in all this effort.
>5. Ensure the first row is used to create the column headings by checking the box `Parse next 1 line(s) as column headers` | ||
>6. Make sure the `Parse cell text into numbers, dates, ...` box is not checked, so OpenRefine doesn't try to automatically detect numbers | ||
>7. Once you are happy click the `Create Project >>` button at the top right of the screen. This will create the project and open it for you. Projects are saved as you work on them, there is no need to save copies as you go along. | ||
>1. Locate the file which you have downloaded called `doaj-article-sample.csv` |
What's the reasoning for changing the steps here?
That... is super weird. It looked fine in Jekyll before I did the pull request. I'll fix this lesson. It's been a while: what's the best procedure for making changes in a pull request? Do the changes locally and then do a re-pull?
I worked on this episode a few months ago but wasn't worried about it reverting, it's easy enough to update ;-) I've updated a PR before by opening the file in a new tab and editing it directly on GitHub, which lets you commit at the bottom of the page; the commit then appears in the PR. To do it locally, try this: https://stackoverflow.com/questions/9790448/how-to-update-a-pull-request-from-forked-repo
@pitviper6 I did wonder if something had gone askew with the version somewhere. To update the PR you just commit more changes to the branch in your fork, and those commits will automatically appear in this PR.
@ccronje if you think we can revert any changes in this PR that have occurred by accident, then I'm OK with merging this then tidying - I just don't want to miss anything.
### Going Further |
What's the reasoning behind removing the exercise here?
It shouldn't have been removed (see above - things somehow got weird between the last time I checked the layout in Jekyll and the pull request). When I fix this I'll re-add the exercise.
I don't even know what I thought I was correcting or changing in lesson #2. Let's not merge that one!
_episodes/03-working-with-data.md
Outdated
@@ -29,23 +29,23 @@ OpenRefine only displays a limited number of rows of data at one time. You can a | |||
Most options to work with data in OpenRefine are accessed from drop down menus at the top of the data columns. When you select an option in a particular column (e.g. to make a change to the data), it will affect all the cells in that column. If you want to make changes across several columns, you will need to do this one column at a time. | |||
|
|||
## Rows and Records | |||
OpenRefine has two modes of viewing data 'Rows' and 'Records'. At the moment we are in Rows mode, where each row represents a single record in the data set - in this case, an article. In Records mode, OpenRefine can link together multiple rows as belonging to the same Record. | |||
OpenRefine has two modes of viewing data `Rows` and `Records`. At the moment we are in Rows mode, where each row represents a single record in the data set - in this case, an article. In Records mode, OpenRefine can link together multiple rows as belonging to the same Record. |
I don't think the back-tick formatting should be used here, because this is about the concepts of Row and Record, whereas back-tick formatting is usually used to indicate a link or function in the UI.
Agreed - I kept going back and forth with some of these and I don't think I was quite consistent.
_episodes/03-working-with-data.md
Outdated
|
||
### Choosing a good separator | ||
|
||
The value that separates multi-valued cells is called a separator or delimiter. Choosing a good | ||
separator is important. In the examples, we've seen the pipe character (\|) has been used. | ||
separator is important. In the examples, we've seen the pipe character ("\|") has been used. |
I don't think we need parentheses and inverted commas here, happy with either one or the other
I think I like the parens, since there are so many inverted commas and apostrophes lurking around, so I'll change them all to parens across the lessons.
_episodes/03-working-with-data.md
Outdated
|
||
Choosing the wrong separator can lead to problems. Consider the following multi-valued Author example. | ||
with a pipe as a separator. | ||
``` | ||
Jones, Andrew | Davis, S. | ||
``` | ||
|
||
When we tell OpenRefine to split this cell on the pipe (\|), we will get the following two authors each in their own cell since there is a single pipe character separating them. | ||
When we tell OpenRefine to split this cell on the pipe ("\|"), we will get the following two authors each in their own cell since there is a single pipe character separating them. |
I don't think we need parentheses and inverted commas here, happy with either one or the other
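For reference while reviewing the separator wording, here is a rough Python sketch of the split the episode describes. It is not OpenRefine's implementation; the function name `split_multi_valued` and the whitespace trimming are assumptions made for illustration. The second call shows why a comma would be a poor separator for this Author data.

```python
def split_multi_valued(cell: str, separator: str = "|") -> list[str]:
    """Split a multi-valued cell on a separator and trim each value.

    Hypothetical helper for illustration only; the trimming of
    surrounding whitespace is a choice made for this sketch.
    """
    return [part.strip() for part in cell.split(separator)]


# The Author example from the episode, split on the pipe character.
authors = split_multi_valued("Jones, Andrew | Davis, S.")
print(authors)

# Splitting the same cell on a comma breaks each name apart instead.
wrong = split_multi_valued("Jones, Andrew | Davis, S.", ",")
print(wrong)
```

The pipe works here because it does not occur inside any individual author name, whereas the comma does.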
_episodes/05-clustering.md
Outdated
|
||
The 'Clusters' are created automatically according to an algorithm. OpenRefine supports a number of different clustering algorithms - some experimentation may be required to see which clustering algorithm works best with any particular set of data, and you may find that using different algorithms highlights different clusters. | ||
|
||
For more information on the methods used to create Clusters, see [https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth](https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth) | ||
|
||
For each cluster, you have the option of 'merging' the values together - that is, replace the various inconsistent values with a single consistent value. By default, OpenRefine uses the most common value in the cluster as the new value, but you can select another value by clicking the value itself, or you can simply type the desired value into the 'New Cell Value' box. | ||
For each cluster, you have the option of 'merging' the values together - that is, replace them with a single consistent value. By default, OpenRefine uses the most common value in the cluster as the new value, but you can select another value by clicking the value itself, or you can simply type the desired value into the 'New Cell Value' box. |
This sentence was changed for clarity following tutor feedback, so I think we need to leave it as it is.
Weird, I didn't edit any of the language beyond fixing typos and I did do an upstream grab of the repo to my local before editing. It seems like I ended up working with earlier files? Eeep, I hope not.
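To make the sentence under discussion concrete, the following is a simplified Python sketch of key-collision clustering followed by a merge to the most common value. It is only an approximation of the idea: the `fingerprint` function below is a cut-down assumption rather than OpenRefine's actual algorithm, and the publisher values are made-up sample data.

```python
import string
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Simplified key: lowercase, drop punctuation, sort unique tokens."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = value.strip().lower().translate(table)
    return " ".join(sorted(set(cleaned.split())))


def cluster(values: list[str]) -> list[list[str]]:
    """Group values sharing a key; keep groups with more than one spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


def merge_target(group: list[str]) -> str:
    """Pick the most common value in a cluster as the merged value."""
    return Counter(group).most_common(1)[0][0]


# Hypothetical sample data, not taken from the lesson's data set.
publishers = ["MDPI AG", "MDPI AG", "mdpi ag", "Mdpi AG.", "Hindawi"]
clusters = cluster(publishers)
new_value = merge_target(clusters[0])
print(clusters, new_value)
```

In this sketch the inconsistent spellings collapse to one key, and the merge replaces all of them with the value that occurs most often, which matches the default behaviour the episode text describes.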
_episodes/05-clustering.md
Outdated
>* Try changing the clustering method being used - which ones work well? | ||
{: .challenge} | ||
|
||
>1. Split out the author names into individual cells using `Edit cells -> Split multi-valued cells`, using the pipe "\|" character as the separator |
Another mention of the pipe character - we should be consistent in how we present this across all episodes
>4. On the Date column dropdown select ```Edit column->Add column based on this column```. Using this function you can create a new column, while preserving the old column | ||
>5. In the 'New column name' type "Formatted Date" | ||
>6. In the 'Expression' box type the GREL expression ```value.toString("dd MMMM yyyy")``` | ||
{: .checklist} |
Why has the final bit of this exercise been removed? Has it gone somewhere else?
Yes - I split this exercise between the boolean lesson and the arrays lesson. So the missing bit is in the next lesson. Does that make sense to you guys?
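As a cross-check on what the checklist's GREL expression should produce, here is a rough Python equivalent of the `"dd MMMM yyyy"` pattern. The mapping to `%d %B %Y` and the sample date are assumptions for illustration, and the month name depends on the locale (English under Python's default C locale).

```python
from datetime import datetime

# Hypothetical sample date standing in for a parsed value in the Date column.
parsed = datetime(2017, 12, 10)

# "%d %B %Y" is an approximate counterpart of the pattern "dd MMMM yyyy":
# two-digit day, full month name, four-digit year.
formatted = parsed.strftime("%d %B %Y")
print(formatted)
```

Because the formatted result lands in a new column, the original Date column is preserved, as step 4 of the checklist notes.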
_episodes/13-looking-up-data.md
Outdated
|
||
Reconciliation services can be more sophisticated and often quicker than using the method described above to retrieve data from a URL. However, to use the ‘Reconciliation’ function in OpenRefine requires the external resource to support the necessary service for OpenRefine to work with, which means unless the service you wish to use supports such a service you cannot use the ‘Reconciliation’ approach. | ||
|
||
There are a few services where you can find an OpenRefine Reconciliation option available. For example Wikidata has a reconciliation service at [https://tools.wmflabs.org/openrefine-wikidata/en/api](https://tools.wmflabs.org/openrefine-wikidata/en/api). | ||
There are a few services where you can find an OpenRefine Reconciliation option available. For example WikiData has a (fledgling) reconciliation service at [https://tools.wmflabs.org/wikidata-reconcile/](https://tools.wmflabs.org/wikidata-reconcile/). |
I'd prefer not to say 'fledgling' here. This will change over time, and I think it would be better to avoid something we need to update or make a judgement call on.
Not my edit! I agree, I'd remove fledgling and will do so.
The other thing is the URL - I think the best URL to use is now https://tools.wmflabs.org/openrefine-wikidata/ (this is a new change - I'm just flagging here because I just noticed it!)
_episodes/13-looking-up-data.md
Outdated
@@ -71,11 +71,11 @@ The next exercise demonstrates this two stage process in full. | |||
{: .challenge} | |||
|
|||
## Reconciliation services | |||
Reconciliation services allow you to lookup terms from your data in OpenRefine against external services, and use values from the external services in your data. The official wiki provides [detailed information about this feature](https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation). | |||
Reconciliation services allow you to lookup terms from your data in OpenRefine against external services, and use values from the external services in your data. |
What's the reasoning behind removing this link here?
Hmm, I didn't remove anything consciously. Again, I wonder if the upstream grab I did didn't work, although I got all the right messages.
@ccronje @ostephens AAAARGH. Years ago, I created a GitHub user account for a class and promptly forgot about it (Juliane666). Now it seems to be messing me up completely: it's popping up and I really don't understand why, and I'm afraid to delete it... So that's why there are these rogue commits, and I don't know why my terminal is suddenly using that account instead of the pitviper6 name I've been using forever.
re-added the word 'publisher'
replaced text that was accidentally deleted.
Removed 'fledgling', updated URLs.
Ok, I've made all the changes - should I commence merging or do you guys want to take one more look?
I'll have a quick look
In terms of commits on the wrong username - possibly this? https://help.github.com/articles/why-are-my-commits-linked-to-the-wrong-user/
Yeah, I do think it has something to do with me re-activating my Harvard email and working from a Harvard machine. That's probably why it's popped up again.
On Sun, Dec 10, 2017 at 10:55 AM, Owen Stephens wrote:
> In terms of commits on the wrong username - possibly this? https://help.github.com/articles/why-are-my-commits-linked-to-the-wrong-user/
|
Sorry, I'm a bit lost in terms of what will merge from the two GitHub accounts. If you both are happy to merge, let's do it and review.
OK - let's merge, then sort out any issues
I've broken up Episode 7 into more granular episodes and have renumbered the episodes accordingly (there are now 13 episodes in the OR lesson)
I've kept the old Using Transformations doc in case we need the original - don't know where we should put it?
I've added a bit more time to the Transformation lesson estimates. It was originally 20 minutes teaching, 40 minutes exercises (or checklists). As I've rewritten them, I added 5 minutes to each total.
I've pulled out the lesson on Exports so it stands alone but it needs a checklist or exercise.
I've transformed most of the exercises into checklists for all of the OpenRefine episodes.
I've also created grayed boxes for any text that is a button or menu location for all of the OpenRefine episodes.