Posts

How to fix a Corsair mouse scroll wheel that is not working or behaving erratically

Corsair Sabre Pro Gaming Mouse partial disassembly (model # RGP0091, and perhaps similar models with similar problems)
How to open the Corsair Sabre Pro Champion Series mouse in 7 steps:
OPTIONAL: At the front of the mouse, carefully but firmly lift each mouse button straight up until it clicks and stays raised. This exposes the locations of the 2 non-removable clips, one on each side of the mouse USB cable, that are part of the bottom half of the assembly.
Insert a screwdriver or plastic wedge tool between the front and bottom assemblies to gently pry the top and bottom halves apart. At the same time, push forward on the 1st clip location to push it back just enough to give the top assembly clearance to gap apart; a 2nd screwdriver or plastic wedge tool, inserted with some force, helps separate the top and bottom halves.
WARNING: The clips are plastic and part of the bottom assembly itself. They cannot be removed. They only need to be pushed inward to allow

Display HTML content from a URL within OpenRefine using IFRAMES

OpenRefine has a powerful feature in its Custom Tabular Exporter, which can be used to fetch and preview URLs so you can review their HTML content in your web browser. Handy for doing small reconciling tasks at times. :)
1. Create an empty <iframe> element with the URL as its src (you can also add any iframe attributes you need, like width, height, etc. - https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe ).
2. Choose EXPORT --> Custom Tabular Exporter and select the URL column(s) and any others you may wish to view alongside the HTML content.
3. Choose the Download tab, select HTML Table, and then click the Preview button. This should launch another browser tab.
4. The HTML Table is rendered, with the iframe content fetched from each URL.
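To make the idea concrete outside of OpenRefine, here is a minimal Python sketch of the kind of markup the steps above end up producing: an HTML table where one cell per row holds an <iframe> pointing at that row's URL. This is not the exporter's actual code; the function names `iframe_cell` and `html_table`, the default width and height, and the example URL are all assumptions made up for illustration.

```python
from html import escape


def iframe_cell(url, width=600, height=400):
    """Return an <iframe> element whose src is the given URL (hypothetical helper)."""
    # escape(..., quote=True) keeps characters like & and " from breaking the attribute
    return '<iframe src="{}" width="{}" height="{}"></iframe>'.format(
        escape(url, quote=True), width, height
    )


def html_table(rows):
    """Build a simple HTML table from (label, url) pairs, one iframe per row."""
    lines = ["<table>"]
    for label, url in rows:
        lines.append(
            "<tr><td>{}</td><td>{}</td></tr>".format(escape(label), iframe_cell(url))
        )
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example URL is a placeholder, not taken from the original post
    print(html_table([("Example", "https://example.org/")]))
```

Saving the printed output to an .html file and opening it in a browser gives roughly the same side-by-side view as the exporter preview. Note that some sites refuse to load inside an iframe, so a blank cell does not always mean the URL is bad.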

Open Source Tools for Data Mining

I have been asked what my favorite Open Source Tools for Data Mining with Statistics support are. In no particular order, other than recall, here they are. Feel free to comment on these, or on any others you like that fall into this same category, and the reasons why:
R - http://r-project.org/
GGobi - http://www.ggobi.org - a visualization program for R
Mondrian - http://stats.math.uni-augsburg.de/Mondrian - a visualization program for R (more biased for category work)
KNIME - http://www.knime.org
Orange - http://orange.biolab.si
Tanagra - http://eric.univ-lyon2.fr/~ricco/tanagra
Weka - http://www.cs.waikato.ac.nz/ml/weka
Yale / RapidMiner - http://rapid-i.com
Enjoy! -Thad

Review of new parseHtml() Function in Google Refine

In my last post, I mentioned how Beautiful Soup is an elegant way to parse HTML with Google Refine. Well, it just got better thanks to Iain Sproat's latest commit to Google Refine (and his Java skills are getting better all the time!). If you pull down trunk and build, you'll see that he has integrated the jsoup.org Java library, an HTML parser in the same vein as Beautiful Soup. Iain has done a great job of pushing the jsoup Element stack right up to GREL (Google Refine Expression Language) for concise usage. I love it!
Using jsoup's simple selector syntax, I was able to easily parse out company websites from LinkedIn's public pages. The expression I used selects the div called data-table that contains the term Website and returns the htmlText of the 2nd <a href>. In Refine, the ordering starts at [0], so in this case [1] gives the 2nd href link. The jsoup.org website's cookbook and its selector-syntax reference are a great place to begin learning more.
Enjoy! -Thad
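For readers who want to see that selection logic spelled out, here is a rough Python sketch using only the standard library's html.parser. It is not GREL and not jsoup; it just mimics the same steps: find the div, check that it mentions Website, and take the 2nd link at index [1]. The sample markup, the assumption that data-table is a class name, and the names `DataTableLinks` and `second_link_text` are all invented for illustration.

```python
from html.parser import HTMLParser


class DataTableLinks(HTMLParser):
    """Collect the text of every <a> found inside <div class="data-table">."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0    # nesting depth inside the target div (0 = outside)
        self.in_link = False  # True while inside an <a> within the target div
        self.div_text = []    # all text seen inside the target div
        self.links = []       # text of each <a> inside the target div

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            # Enter the target div, or track a nested div while already inside it
            if self.div_depth or "data-table" in classes:
                self.div_depth += 1
        elif tag == "a" and self.div_depth:
            self.in_link = True
            self.links.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth:
            self.div_depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.div_depth:
            self.div_text.append(data)
            if self.in_link:
                self.links[-1] += data


def second_link_text(html):
    """Return the text of the 2nd link (index 1) if the div mentions 'Website'."""
    parser = DataTableLinks()
    parser.feed(html)
    parser.close()
    if "Website" in "".join(parser.div_text) and len(parser.links) > 1:
        return parser.links[1].strip()
    return None


# Made-up markup, only loosely shaped like a company profile page
SAMPLE = """
<div class="data-table">
  <dl>
    <dt>Type</dt><dd><a href="/companies">Privately Held</a></dd>
    <dt>Website</dt><dd><a href="http://www.example.com">http://www.example.com</a></dd>
  </dl>
</div>
"""

if __name__ == "__main__":
    print(second_link_text(SAMPLE))
```

All of that bookkeeping is exactly what a selector syntax saves you from, which is why having jsoup available right inside GREL is so nice.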

Google Refine and easier HTML parsing

After spending most of the day BANGING my head on using Regex and GREL to handle HTML parsing, I thought, there MUST be a better way to parse HTML!!! I know several of you have thought the same thing. So, I took the time today to find out where and how this could be improved directly in Google Refine or with an extension.
It just so happens that Google Refine already has a wonderful extension in the form of another language: Jython. Enter BeautifulSoup (love the name?), a Python library, usable from Jython, for powerful HTML parsing and entity extraction.
Here's more on how to use it easily within Refine: http://code.google.com/p/google-refine/wiki/StrippingHTML
Enjoy! -Thad
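To show why a real parser beats Regex for this kind of job, here is a small sketch of stripping tags from a string. Note the swap: it uses Python 3's built-in html.parser rather than BeautifulSoup, so it runs anywhere without installing anything, and it is plain Python rather than Refine's Jython. The names `TextOnly` and `strip_html` are made up for this example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keep only the text between tags; the tags themselves are dropped."""

    def __init__(self):
        # convert_charrefs defaults to True, so entities like &amp; are decoded
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_html(value):
    """Return the text content of an HTML fragment with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(value)
    parser.close()
    # Join the pieces, then collapse runs of whitespace into single spaces
    return " ".join("".join(parser.chunks).split())


if __name__ == "__main__":
    print(strip_html("<p>Hello <b>World</b> &amp; friends</p>"))
```

The parser copes with nested tags, attributes, and entities without a single regular expression. One caveat of this simple version: the contents of script and style elements are kept as text too.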