
Tag: google

nothing beats the command line

Over the last couple of days I’ve been experimenting a bit with
different backup methods. To begin, I tried out ExecutiveSync and its
successor You Synchronize, but they are very, very slow. Not only did
the first synchronization of a 0.5 GB folder between two computers over
our Airport network take over 2.5 hrs, but on subsequent syncs the
checking of the database also seems to last forever.

So I turned to the fink project again and found two interesting
packages: wget and rsync. GNU Wget is a free network utility to
retrieve files from the World Wide Web using HTTP and FTP, so one way
to back up a folder would be to put it in the Sites folder and mirror
it over the network using wget. I didn’t check this out in great detail
(I did a small test to see it working, but I assume it will be slow for
large folders).

The other one is rsync. It uses the “rsync algorithm”, which provides a
very fast method for bringing remote files into sync. It does this by
sending just the differences in the files across the link, without
requiring that both sets of files are present at one of the ends of the
link beforehand. This seems to be precisely what I wanted to do, and
after a google for ‘rsync OS X’ I arrived at the RsyncX package, which
is an implementation of rsync with HFS support and configuration
through a command line (Terminal) or graphical user interface. I
downloaded this package (the GUI seems to be placed in
Applications/Utilities) and tried it out by filling out the Source and
Local Folders and pressing the synchronize button. Not much progress
was reported, but the Activity Monitor showed that it was using up all
of the CPU, so I was patient for over an hour. Then I looked at the
Network Activity in the Activity Monitor: virtually no packets were
going in or out, so I killed RsyncX. I am sure I did something wrong,
but rather than trying to get it working, I tried the command-line
rsync command I downloaded from Fink. After a few false attempts I
typed

/sw/bin/rsync -a -e ssh iMatrixLieven.local:/Users/lieven/Documents /Users/lieven/docsLieven

and suddenly the packets were flying happily over the network at 250
KB/sec, so it took me only half an hour to get a first synchronization
done, and subsequent changes are added in no time! Afterwards I
discovered that rsync is already included in the standard OS X
Developer Tools: RsyncX seems to have renamed it to rsync_orig and
installed a new (quite large) rsync in /usr/bin. Maybe my problems with
RsyncX were caused by my having /sw/bin earlier in my $PATH than
/usr/bin, but verifying this will have to await another day. For the
moment, I’m happy to have a quick synchronizing tool available and
Real Madrid is playing on the TV…
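
To make the idea of "sending just the differences" a bit more concrete, here is a toy sketch in Python. It is not the real rsync algorithm (which uses a rolling checksum so that it can also match blocks at unaligned offsets); it only compares fixed, aligned blocks. The function names and the tiny block size are assumptions made up for this illustration.

```python
import hashlib

BLOCK = 4  # unrealistically small block size, chosen only for the demo


def block_signatures(old: bytes, block: int = BLOCK) -> dict:
    """Receiver side: map the hash of each block of the old copy to its index."""
    sigs = {}
    for i in range(0, len(old), block):
        digest = hashlib.sha256(old[i:i + block]).hexdigest()
        sigs.setdefault(digest, i // block)
    return sigs


def make_delta(new: bytes, sigs: dict, block: int = BLOCK) -> list:
    """Sender side: refer to blocks the receiver already has, send the rest."""
    delta = []
    for i in range(0, len(new), block):
        chunk = new[i:i + block]
        idx = sigs.get(hashlib.sha256(chunk).hexdigest())
        delta.append(("copy", idx) if idx is not None else ("data", chunk))
    return delta


def apply_delta(old: bytes, delta: list, block: int = BLOCK) -> bytes:
    """Receiver side: rebuild the new copy from the old copy plus the delta."""
    out = bytearray()
    for kind, value in delta:
        if kind == "copy":
            out += old[value * block:(value + 1) * block]
        else:
            out += value
    return bytes(out)


old = b"the quick brown fox jumps"   # what the receiver already has
new = b"the quick green fox jumps"   # what the sender wants it to have

delta = make_delta(new, block_signatures(old))
sent = sum(len(v) for kind, v in delta if kind == "data")

print(apply_delta(old, delta) == new)           # True
print(sent, "of", len(new), "bytes sent raw")   # 8 of 25 bytes sent raw
```

Only the two blocks that actually changed travel over the link as raw data; everything else is a short reference to a block the receiver already holds, which is why later syncs of a mostly unchanged folder finish so quickly.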


GAP on OS X


GAP, the Groups, Algorithms, and Programming tool (developed by two
groups, one in St. Andrews, the other in Aachen), is the package if you
want to work with (finite or finitely presented) groups, but it also
has some routines for algebras, fields, division algebras, Lie algebras
and the like. For years it has been available for MacClassic, but I did
not put it back after the last clean install of my computer, as I was
expecting a Mac OS X port to be distributed soon. From time to time I
checked the webpage at gap-system.org, but it seemed that no one cared
for OS X. For my “The book of points” project I need a system to make
lots of examples, so I thought I might just as well install the UNIX
version. Fortunately, I did a last desperate Google on GAP OS X, which
brought me to the Aachen pages of the GAP group, where one seems to be
more Macintosh-minded. The relevant page is the further notes for OS X
on the GAP installation for UNIX page. Here is what I did to get GAP
running under OS X. First go to the download page (btw. this page has
version 4.4 whereas St. Andrews is still distributing 4.3) and download
the files

gap4r4.tar.gz
packages-2004_01_27-11_37_UTC.tar.gz
xtom1r1.tar.gz

This will give you three tar-files on your Desktop. Fire
up the Terminal and make a new directory /usr/local/lib if
it doesn’t exist yet. Then, go to your Desktop folder and do

sudo cp gap4r4.tar /usr/local/lib
sudo cp xtom1r1.tar /usr/local/lib
cd /usr/local/lib
sudo tar xvf gap4r4.tar
sudo tar xvf xtom1r1.tar

Then return to your Desktop folder, copy the remaining tar-file into
the /usr/local/lib/gap4r4/pkg folder (which is created by untarring the
former two files) and untar it as above. Then it is time to compile
everything (assuming you have installed the Developer Tools), and there
is one magic OS X command which will speed up GAP by 20%. Here is what
to do

cd /usr/local/lib/gap4r4
sudo ./configure
sudo make COPTS="-fast -mcpu=7450"

and everything will compile nicely. If you are so lucky as to have a
G5 system, you should replace the last command by
sudo make COPTS="-O3". Finally, get everything in the right place

cd /usr/local/lib/gap4r4/bin
sudo cp gap.sh /usr/local/bin/gap

and if /usr/local/bin is in your $PATH then typing gap at the command
line will give you the opening GAP banner.


google spammers


In the GoogleMatrix I tried to understand the concept of the PageRank
algorithm that Google uses to list pages according to their
‘importance’. So, if you want your webpage to come out first in a
certain search, you have to increase your PageRank value (which
normally is a measure of the webpages linking to your page)
artificially. A method to achieve this is link spamming: if page A is
the webpage of which you want to increase the PageRank value, take a
page B (either under your control or that of a webmaster friend) and
add a dummy link page B -> page A. To find out the effect of this on
the PageRank, and how the second eigenvalue of the GoogleMatrix is able
to detect such constructs, let us set up a micro-web consisting of just
3 pages with links 1->2 and 1->3. The corresponding GoogleMatrix (with
c=0.85 and v=(1/3,1/3,1/3)) is

$$\begin{pmatrix}
1/3 & 1/20 & 1/20 \\
1/3 & 9/10 & 1/20 \\
1/3 & 1/20 & 9/10
\end{pmatrix}$$

which has eigenvalues 1, 0.85 and 0.28. The eigenvector with
eigenvalue 1 (the PageRank) is equal to (0.15,1,1), so page 2 and page
3 are equally important to Google, and if we scale PageRank such that
it adds up to 100% over all pages, the relative importance values are
7.0%, 46.5% and 46.5%. In this case the eigenvector corresponding to
the second eigenvalue 0.85 is (0,-1,1) and hence detects the two
leaf-nodes. Now, assume the owner of page 2 sets up a link spam by
creating page 4 and linking 4->2; then the corresponding GoogleMatrix
(with v=(1/4,1/4,1/4,1/4)) is

$$\begin{pmatrix}
77/240 & 3/80 & 3/80 & 3/80 \\
77/240 & 71/80 & 3/80 & 37/80 \\
77/240 & 3/80 & 71/80 & 3/80 \\
3/80 & 3/80 & 3/80 & 37/80
\end{pmatrix}$$

which has eigenvalues 1, 0.85, 0.425 and 0.283. The PageRank
eigenvector with eigenvalue 1 is in this case (0.8,8.18,5.35,1), or in
relative importance (5.2%,53.4%,34.9%,6.5%), and we see that the
spammer achieved his/her goal. The eigenvector corresponding to the
second eigenvalue is (0,-1,1,0), which again gives the leaf-nodes, and
the eigenvector of the third eigenvalue is (0,-1,0,1), which detects
the spam-construct.
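
For readers who want to re-check these numbers, here is a small Python sketch using only the standard library. It rests on two assumptions that are not spelled out in the post but that reproduce both matrices exactly: every page is also counted as linking to itself, and the spam link from page 4 points at page 2 (which is what the 4-page matrix encodes). The helper names are made up for this illustration.

```python
from fractions import Fraction as F

C = F(85, 100)  # the damping factor c = 0.85 used in the post


def google_matrix(links, n, c=C):
    """Column-stochastic matrix c*S + (1-c)*(1/n) with uniform v.

    links[j] lists the pages that page j links to (pages are numbered
    from 1). Assumption: every page also links to itself.
    """
    A = [[(1 - c) / n for _ in range(n)] for _ in range(n)]
    for j in range(1, n + 1):
        targets = set(links.get(j, ())) | {j}
        for i in targets:
            A[i - 1][j - 1] += c / len(targets)
    return A


def matvec(A, x):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def percent(A, steps=300):
    """PageRank by power iteration, scaled to add up to 100%."""
    x = [1.0 / len(A)] * len(A)
    Af = [[float(a) for a in row] for row in A]
    for _ in range(steps):
        x = matvec(Af, x)
    return [round(100 * xi, 1) for xi in x]


A3 = google_matrix({1: [2, 3]}, 3)           # micro-web 1->2, 1->3
A4 = google_matrix({1: [2, 3], 4: [2]}, 4)   # plus spam page 4->2

print(percent(A3))  # [7.0, 46.5, 46.5]
print(percent(A4))  # [5.2, 53.4, 34.9, 6.5]

# Exact checks of the second and third eigenvectors of the 4-page matrix
print(matvec(A4, [0, -1, 1, 0]) == [0, -C, C, 0])          # eigenvalue 0.85
print(matvec(A4, [0, -1, 0, 1]) == [0, -C / 2, 0, C / 2])  # eigenvalue 0.425
```

Working with exact fractions makes it easy to compare the entries of A3 and A4 with the matrices quoted above, while the power iteration shows how page 2 pulls ahead of page 3 once the spam page is added.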
