My brain hurts! we really think what you know

31 August 2010

Using Google Translate from console

If you have ever used Google Translate and wished you could do the same from console, here is a Python script that does just that.

The script translates words and entire sentences between any language pair known to Google Translate. It accepts text passed in as shell arguments as well as data from standard input.

NOTE: This script stopped working after version 1 of the translation API was discontinued in December 2011. See the updated script below for a working version.

NOTE: According to http://code.google.com/apis/language/translate/v1/reference.html
Important: Google Translate API v1 was officially deprecated on May 26, 2011; it was shut off completely on December 1, 2011. For text translations, you can use the Google Translate API v2, which is now available as a paid service. For website translations, we encourage you to use the Google Website Translator gadget.
#!/usr/bin/env python
from urllib2 import urlopen
from urllib import urlencode
import sys
import os

# The google translate API can be found here:
# http://code.google.com/apis/ajaxlanguage/documentation/#Examples

# Language codes are listed here:
#http://code.google.com/apis/ajaxlanguage/documentation/reference.html#LangNameArray

if len(sys.argv) < 3:
    name = os.path.basename(sys.argv[0])
    print '''
Usage:
    %s en es lovely spam
    %s es en < file.txt

Available language codes are listed here:
http://code.google.com/apis/ajaxlanguage/documentation/reference.html#LangNameArray
''' % (name,name)
    sys.exit(-1)

## hack to be able to display UTF-8 in Windows console
if sys.platform == "win32":
    ## set utf8 console
    if not sys.stdin.encoding == 'cp65001':
        os.system('chcp 65001 > nul')
    class UniStream(object):
        __slots__= "fileno", "softspace",
        def __init__(self, fileobject):
            self.fileno= fileobject.fileno()
            self.softspace = False
        def write(self, text):
            if isinstance(text, unicode):
                os.write(self.fileno, text.encode("utf_8"))
            else:
                os.write(self.fileno, text)
    sys.stdout= UniStream(sys.stdout)
    sys.stderr= UniStream(sys.stderr)

lang1=sys.argv[1]
lang2=sys.argv[2]
langpair='%s|%s'%(lang1,lang2)

if len(sys.argv) > 3:
    text=' '.join(sys.argv[3:])
else:
    text=sys.stdin.read()

base_url='http://ajax.googleapis.com/ajax/services/language/translate?'
params=urlencode( (('v',1.0),
('q',text),
('langpair',langpair),) )
url=base_url+params
content=urlopen(url).read()
start_idx=content.find('"translatedText":"')+18
translation=content[start_idx:]
end_idx=translation.find('"}, "')
translation=translation[:end_idx]
sys.stdout.write(translation + '\n')

This is the updated script, which uses the web API. It should work after December 2011.

#!/usr/bin/env python
import sys
import os
import urllib2
from urllib import urlencode
import cookielib
import re

# The google translate API can be found here (***NOT OPERATIONAL SINCE DECEMBER 2011***):
# http://code.google.com/apis/ajaxlanguage/documentation/#Examples

# Language codes are listed here:
#http://code.google.com/apis/ajaxlanguage/documentation/reference.html#LangNameArray

if len(sys.argv) < 3:
    name = os.path.basename(sys.argv[0])
    print '''
Usage:
    %s en es lovely spam
    %s es en < file.txt

Available language codes are listed here:
http://code.google.com/apis/ajaxlanguage/documentation/reference.html#LangNameArray

''' % (name,name)
    sys.exit(-1)

## hack to be able to display UTF-8 in Windows console

def fix_win32_console():
    ## set utf8 console
    if not sys.stdin.encoding == 'cp65001':
        os.system('chcp 65001 > nul')
    class UniStream(object):
        __slots__= "fileno", "softspace",
        def __init__(self, fileobject):
            self.fileno= fileobject.fileno()
            self.softspace = False
        def write(self, text):
            if isinstance(text, unicode):
                os.write(self.fileno, text.encode("utf_8"))
            else:
                os.write(self.fileno, text)
    sys.stdout= UniStream(sys.stdout)
    sys.stderr= UniStream(sys.stderr)

if sys.platform == "win32":
    fix_win32_console()

lang1=sys.argv[1]
lang2=sys.argv[2]

if len(sys.argv) > 3:
    text=' '.join(sys.argv[3:])
else:
    text=sys.stdin.read()

base_url='http://translate.google.com.br/translate_a/t'
# sample browser request
#http://translate.google.com/translate_a/t?client=t&text=col&hl=en&sl=en&tl=es&multires=1&otf=2&ssel=4&tsel=0&sc=1
params=urlencode({'client':'t',
    'text':text,
    'hl':'en',
    'sl':lang1,
    'tl':lang2,
    'otf':2,
    'multires':1,
    'ssel':0,
    'tsel':0,
    'sc':1,
    })

url=base_url + '?' + params

cookiejar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookiejar))
opener.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1018.0 Safari/535.19'),
                    ('Referer', 'http://translate.google.com/')
]
response = opener.open(url)
translation=response.read()
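# the reply is expected to start with [[["<translated text>"
matcher = re.search(r'\[\[\["(?P<human_readable_chunk>[^")]*)', translation)
if matcher is None:
    sys.exit('Unexpected reply from the server')
sys.stdout.write(matcher.group('human_readable_chunk'))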
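
The last lines of the script assume that the reply starts with `[[["` followed by the translated text. Here is a minimal Python 3 sketch of that extraction step on its own. The function name `extract_translation` and the sample reply are made up for illustration; the reply format is only inferred from the regular expression in the script, not from any documentation.

```python
import re

# Pattern copied from the script above; it assumes the reply begins
# with [[["<translated text>" (an inference, not a documented format).
CHUNK_RE = re.compile(r'\[\[\["(?P<human_readable_chunk>[^")]*)')


def extract_translation(reply):
    """Return the first translated chunk, or None if the reply looks different."""
    match = CHUNK_RE.search(reply)
    if match is None:
        return None
    return match.group('human_readable_chunk')


if __name__ == '__main__':
    # Hypothetical reply, shaped like the pattern expects.
    print(extract_translation('[[["hola","hello"]]]'))
```

Returning None instead of raising lets the caller decide what to print when the server sends back something unexpected, such as an HTML error page.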



Save the script to a file such as gtrans.py and run it as follows (assuming you have Python in your path):

python gtrans.py en es Nobody expects the Spanish Inquisition

The first two parameters are the language codes. A list of codes known to Google Translate is available here: http://code.google.com/apis/ajaxlanguage/documentation/reference.html. For some reason, not all of the listed codes are actually accepted, for example bo for Tibetan.

To pipe a text file through the script:

python gtrans.py en es < myfile.txt

It is also possible to enter multi-line text directly from the console. To do so, call the script with the language codes only, i.e.:

python gtrans.py en es

Enter your text and use the Enter key to start a new line. When you are done, press Ctrl+D (on Linux) or Ctrl+Z followed by Enter (on Windows).

Note: On Windows, input in languages other than English will not work, due to poor support for Unicode input in cmd.exe. On Linux, international input works fine, provided that the console uses UTF-8.

Keep in mind, though, that Google limits the size of the text that can be translated.
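
One way to stay under such a limit is to split long input into pieces and translate them one by one. Below is a rough Python 3 sketch of the splitting step only. The function name `split_text` and the default of 1000 characters are assumptions for illustration; the actual limit is not stated here.

```python
def split_text(text, max_chars=1000):
    """Split text into chunks of at most max_chars, breaking at line ends where possible."""
    chunks = []
    current = ""
    for line in text.splitlines(True):  # True keeps the line endings
        # A single line longer than the limit is cut into fixed-size pieces.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new chunk when this line would not fit into the current one.
        if len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


if __name__ == '__main__':
    for chunk in split_text("first line\nsecond line\n", max_chars=12):
        print(repr(chunk))
```

Joining the chunks back together gives the original text, so nothing is lost by translating them separately, although a sentence cut in the middle of an overlong line may translate badly.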

Console Google Translate — curl-based version

As an alternative, here is a bash script which uses curl and sed. It has been updated to work via the Google Translate web API.

#! /bin/bash

USAGE="Usage: 
       $0 en es Lovely spam!
Some codes: en|fr|de|ru|nl|it|es|ja|la|pl|bo
All language codes:
http://code.google.com/apis/ajaxlanguage/documentation/reference.html#LangNameArray"

if [ "$#" -lt 3 ]; then
    echo "$USAGE"
    exit 1
fi

FROM_LNG=$1
TO_LNG=$2

shift 2
QUERY=$*

UA="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803"
URL="http://translate.google.com.br/translate_a/t?client=t&hl=en&sl=$FROM_LNG&tl=$TO_LNG&otf=2&multires=1&ssel=0&tsel=0&sc=1"
curl --data-urlencode "text=$QUERY" -A "$UA" -s -g -4 "$URL" | sed 's/","/\n/g' | sed 's/\]\|\[\|"//g' | sed 's/","/\n/g' | sed 's/,[0-9]*/ /g'

28 August 2010

Linux console online dictionary lookup

I often need to look up words in various online dictionaries, but in many cases I would prefer to get the results right in the console, instead of having to launch the browser, type in the URL and wait for the page to load.

For example, here are a few bash one-liners that look up a word right in the console:

Wordreference English-Spanish: 
curl -j -s -A "Opera/9.60 (J2ME/MIDP; Opera Mini/4.2.13337/458; U; en) Presto/2.2.0" "http://www.wordreference.com/es/translation.asp?tranword=`echo $* | sed 's/ /%20/g'`" | html2text -utf8 | less -R

Wordreference Spanish to English:
curl -j -s -A "Opera/9.60 (J2ME/MIDP; Opera Mini/4.2.13337/458; U; en) Presto/2.2.0" "http://www.wordreference.com/es/en/translation.asp?spen=`echo $* | sed 's/ /%20/g'`" | html2text -utf8 | less -R

Merriam Webster Online English Dictionary: 
curl -j -s -A "Opera/9.60 (J2ME/MIDP; Opera Mini/4.2.13337/458; U; en) Presto/2.2.0" "http://www.merriam-webster.com/dictionary/`echo $* | sed 's/ /%20/g'`" | html2text -utf8 | less -R

Wikipedia (EN):
curl -j -s -A "Opera/9.60 (J2ME/MIDP; Opera Mini/4.2.13337/458; U; en) Presto/2.2.0" "http://en.wikipedia.org/wiki/`echo $* | sed 's/ /%20/g'`" | html2text -utf8 | less -R 
Usage: Save the required script to a file, make it executable (chmod +x webster), and then type the script name from the console with the word(s) you need translated as arguments.
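
The one-liners above build the lookup URL by replacing spaces with %20 using sed, which leaves other special characters (such as & or accented letters) unescaped. As a rough illustration, here is how the same URL building could look in Python 3 with full percent-encoding. The names `LOOKUP_URLS` and `lookup_url` and the short site keys are made up for this sketch; the URLs are the ones used above.

```python
from urllib.parse import quote

# URL prefixes taken from the one-liners above; the keys are hypothetical names.
LOOKUP_URLS = {
    "en2es": "http://www.wordreference.com/es/translation.asp?tranword=",
    "es2en": "http://www.wordreference.com/es/en/translation.asp?spen=",
    "webster": "http://www.merriam-webster.com/dictionary/",
    "wikipedia": "http://en.wikipedia.org/wiki/",
}


def lookup_url(site, words):
    """Join the words with spaces and percent-encode the phrase for the given site."""
    phrase = " ".join(words)
    # safe="" also escapes "/" so the phrase cannot change the URL path.
    return LOOKUP_URLS[site] + quote(phrase, safe="")


if __name__ == "__main__":
    print(lookup_url("webster", ["word", "of", "mouth"]))
```

The resulting URL can be passed to curl exactly as in the scripts above.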

Instead of creating a separate script for each dictionary, it might be more practical to add the scriptlets as functions in your .bashrc file (an alias cannot insert its arguments into the middle of the URL), e.g.:

webster() { curl -j -s -A "Opera/9.60 (J2ME/MIDP; Opera Mini/4.2.13337/458; U; en) Presto/2.2.0" "http://www.merriam-webster.com/dictionary/$(echo "$*" | sed 's/ /%20/g')" | html2text -utf8 | less -R; }
Make sure you have curl, html2text, sed, and less installed.

A similar result can be achieved with a console browser such as lynx or w3m.

Install w3m:

sudo apt-get install w3m # for Ubuntu
sudo pacman -S w3m       # Archlinux

Create a wrapper script which includes the URL:

echo 'w3m "http://www.wordreference.com/es/translation.asp?tranword=$*"' > ~/bin/en2es
chmod +x ~/bin/en2es

Now the script can be run like this:

en2es word of mouth

The advantage in this case is that you can use all the standard browser features, like clicking links, etc.

30 April 2010

Bash oneliners or helpful aliases for your .bashrc

Useful bash oneliners that might or might not be worth including as aliases in your ~/.bashrc file.

Bash Oneliners (each command is followed by a line describing its action)
grep -v "^#" $* | grep -v "^$"
print file contents excluding comments starting with # and empty lines
netstat -tpauln
print open TCP and UDP sockets, including listening ones, *and* application names
find . -maxdepth 1  -type d  -exec du -sh {} \; | sort -hr
print sorted directory sizes starting with the biggest
find . -maxdepth 1 -mindepth 1 -type d -print0 | xargs -0 du -sh | sort -hr
same as above, but supposedly more efficient for lots of files
help test | less
does not do much work by itself, but it is good reading
[ -n "$SOMEVAR" ] &&  echo "IS_NOT_EMPTY" || echo "IS_EMPTY"
execute the first command if the string is non-empty, or the second command otherwise (the Python way)
mkdir $(echo {01..22}spam)
create folders 01spam, 02spam, 03spam, etc.
cp filename{,.bak}
create backup of file
!!
rerun previous command
sudo !!
rerun previous command as root
python -m SimpleHTTPServer
run web server for current dir on port 8000 (Python 2.x version)
python -m http.server
run web server for current dir (Python 3 version)
:w !sudo tee %
write file in vim as root
ssh-copy-id remote-machine
copy rsa/dsa key to remote machine for public-key authentication
ffmpeg -f x11grab -s wxga -r 25 -i :0.0 -sameq /tmp/out.mpg
record current desktop to mpeg file
echo -e ${PATH//\:/\\n}
string replace with bash (the example replaces each colon with an end-of-line)
curl -O http://rss.timegenie.com/forex.xml
download file to local file forex.xml
tr -dc ' -~' < /dev/urandom | head -c 20
generate random string
echo $(($RANDOM % 100))
generate random number in the range 0 to 99
sudo find . -user root -exec chown spamneggs {} \;
find all files owned by root and change owner to spamneggs
function mcd {
  mkdir "${1}" && cd "${1}"
}
a ~/.bashrc shortcut to make a directory and move into it.
function catw {
cat `which "${1}"`  
}
a ~/.bashrc shortcut for typing cat `which script`
function vimw {
vim `which "${1}"`  
}
a ~/.bashrc shortcut for typing vim `which script`; you get the idea...
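
For comparison, here is what the first one-liner in the list (dropping comment lines and empty lines) does, written out as a small Python 3 sketch. The function name `strip_comments` is made up for illustration. Like the grep version, it only treats a line as a comment when # is the very first character, and it only drops lines that are completely empty.

```python
def strip_comments(lines):
    """Drop lines starting with '#' and empty lines, like grep -v "^#" | grep -v "^$"."""
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        if text == "" or text.startswith("#"):
            continue
        kept.append(text)
    return kept


if __name__ == "__main__":
    sample = ["# a comment\n", "\n", "PATH=/usr/bin\n", "  # indented comment\n"]
    for kept_line in strip_comments(sample):
        print(kept_line)
```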
15 April 2010

Running a command recursively on all files in a directory tree

Windows solution
Here is a script which calls mplayer on every file in a directory, including subfolders. It works on directories containing spaces and non-ASCII characters, since all file names are converted to short DOS names (via the `~s` modifier):
set mplayer=z:\path_to_mplayer\mplayer.exe
  for /f "delims=" %%f in ('dir /b /s "%cd%"') do (%mplayer% "%%~sf")
Linux bash solution
The bash shell solution is much easier to read than the Windows one:
#!/bin/sh
# find passes each path to mplayer as a single argument,
# so file names containing spaces are handled correctly
find "$(pwd)" -type f -iname "*.mp3" -exec mplayer {} \;

A Better Solution

An alternative solution for both Windows and Linux would be to create a playlist file first, and then run a single mplayer instance with the -playlist switch:
Linux bash
#!/bin/sh
find "$PWD" -type f -iname "*.mp3" > /tmp/allfiles.m3u
mplayer -playlist /tmp/allfiles.m3u
Windows Batch
set mplayer=z:\path_to_mplayer\mplayer.exe
  set playlist=%TEMP%\mp3playlist.m3u
  dir /b /s "%cd%" | find /I ".mp3" > "%playlist%"
  %mplayer% -playlist "%playlist%"
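
Since the playlist approach is the same on both systems, it can also be written once in a portable way. The following Python 3 sketch collects all mp3 files under a directory and writes them to a playlist file. The function names `find_mp3s` and `write_playlist` and the playlist file name in the example are made up for illustration.

```python
import os


def find_mp3s(root):
    """Return sorted absolute paths of all .mp3 files under root (case-insensitive)."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".mp3"):
                found.append(os.path.abspath(os.path.join(dirpath, name)))
    return sorted(found)


def write_playlist(root, playlist_path):
    """Write one path per line to playlist_path and return the list of paths."""
    paths = find_mp3s(root)
    with open(playlist_path, "w", encoding="utf-8") as handle:
        for path in paths:
            handle.write(path + "\n")
    return paths


if __name__ == "__main__":
    # Hypothetical playlist name; pass the resulting file to mplayer -playlist.
    count = len(write_playlist(".", "allfiles.m3u"))
    print("wrote %d entries" % count)
```

The resulting file can then be played with mplayer -playlist allfiles.m3u, as in the shell versions above.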
Shortest solution
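mplayer $(find . -iname "*.mp3")
Note that this breaks on file names containing spaces; find . -iname "*.mp3" -exec mplayer {} + avoids the word splitting.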