Posts about computer science

Artificial Intelligence Finds Ancient Indus Script Matches Spoken Language

Artificial Intelligence Cracks 4,000-Year-Old Mystery by Brandon Keim

An ancient script that’s defied generations of archaeologists has yielded some of its secrets to artificially intelligent computers.

The Indus script, used between 2,600 and 1,900 B.C. in what is now eastern Pakistan and northwest India, belonged to a civilization as sophisticated as its Mesopotamian and Egyptian contemporaries. However, it left fewer linguistic remains. Archaeologists have uncovered about 1,500 unique inscriptions from fragments of pottery, tablets and seals. The longest inscription is just 27 signs long.

They fed the program sequences of four spoken languages: ancient Sumerian, Sanskrit and Old Tamil, as well as modern English. Then they gave it samples of four non-spoken communication systems: human DNA, Fortran, bacterial protein sequences and an artificial language.

The program calculated the level of order present in each language. Non-spoken languages were either highly ordered, with symbols and structures following each other in unvarying ways, or utterly chaotic. Spoken languages fell in the middle.

When they seeded the program with fragments of Indus script, it returned with grammatical rules based on patterns of symbol arrangement. These proved to be moderately ordered, just like spoken languages.
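The excerpt does not say exactly how the program measured "order". One common way to put a number on it is the conditional entropy of a symbol given the symbol before it: close to zero when each symbol rigidly determines the next, and high when anything can follow anything. Here is a minimal sketch under that assumption. The function name and the toy sequences are mine, not from the article.

```python
from collections import Counter
from math import log2


def conditional_entropy(sequence):
    """Estimate H(next symbol | previous symbol) in bits from adjacent pairs."""
    pairs = list(zip(sequence, sequence[1:]))
    pair_counts = Counter(pairs)
    first_counts = Counter(first for first, _ in pairs)
    total = len(pairs)
    entropy = 0.0
    for (first, _), count in pair_counts.items():
        p_pair = count / total                    # P(previous, next)
        p_next_given_first = count / first_counts[first]  # P(next | previous)
        entropy -= p_pair * log2(p_next_given_first)
    return entropy


rigid = "AB" * 50    # toy data: every symbol fully determines the next one
looser = "AABB" * 25  # toy data: each symbol is followed by either A or B

print(conditional_entropy(rigid))
print(conditional_entropy(looser))
```

The rigid sequence scores zero bits and the looser one scores close to one bit (the maximum for a two-symbol alphabet). Real scripts use far larger sign sets, so the absolute numbers here mean little; only the ordering is the point.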

Related: The Rush to Save Timbuktu’s Crumbling Manuscripts - The Mystery of the Voynich Manuscript - Aztec Math

Keeping Out Technology Workers is not a Good Economic Strategy

The job-related barriers between countries are decreasing. Jobs are more international today than 20 years ago and that trend will continue. People are going to move to different countries to do jobs (especially in science, engineering and advanced technology). The USA has a large share of those jobs (for many reasons). But there is nothing that requires those jobs to be in the USA.

The biggest impact of the USA turning away great scientists and engineers will be that they go to work outside the USA and increase the speed at which the USA loses its place as the leading location for science, engineering and technology work. This is no longer the 1960s. Back then those turned away by the USA had trouble finding work elsewhere that could compete with the work done in the USA. If the USA wants to isolate itself (with 5% of the world’s population) from a fairly open global science and engineering job market, other countries will step in (they already are trying, realizing what a huge economic benefit doing so provides).

Those other countries will be able to put together great centers of science and engineering innovation. Those areas will create great companies that create great jobs. I can understand wanting this to be 1960, but wanting it doesn’t make it happen.

You could go even further and shut off foreign science and engineering students’ access to USA universities (which are the best in the world). That would put a crimp in plans for a very short while. Soon many professors would move to foreign schools. The foreign schools would need those professors, and offer a great deal of pay. And those professors would need jobs as their schools laid off professors as students disappeared. Granted, the best schools and best professors could stay in the USA, but plenty of very good ones would leave.

I just don’t think the idea of closing off the companies in the USA from using foreign workers will work. We are lucky now that, for several reasons, it is still easiest to move people from Germany, India, Korea, Mexico and Brazil all to the USA to work on advanced technology projects. The advantage today, however, is much, much smaller than it was 30 years ago. Today just moving all those people to some other location, say Singapore, England, Canada or China, will work pretty well (and 5 years from now will work much better in whatever locations start to emerge as the leading alternative sites). Making the alternative of setting up centers of excellence outside the USA more appealing is not a good strategy for those in the USA wanting science, engineering and computer programming jobs. We should instead do what we can to encourage more companies in the USA that are centralizing technology excellence in the USA.

Comment on Reddit discussion.

Related: Science and Engineering in Global Economics - Global technology job economy - Countries Should Encourage Immigration of Technology Workers - The Software Developer Labor Market - What Graduates Should Know About an IT Career - Relative Engineering Economic Positions - China’s Technology Savvy Leadership - Education, Entrepreneurship and Immigration - The Future is Engineering - Global Technology Leadership

Google Summer of Code 2009

Google Summer of Code is a global program that offers student developers stipends to write code for various open source software projects. Google funds the program with $4,500 for each student (and pays the mentor organization $500). Google works with several open source, free software, and technology-related groups to identify and fund projects over a three month period.

Since its inception in 2005, the program has provided opportunities for nearly 2500 students, from nearly 100 countries. Through Google Summer of Code, accepted student applicants are paired with a mentor or mentors from the participating projects, thus gaining exposure to real-world software development scenarios and the opportunity for employment in areas related to their academic pursuits. In turn, the participating projects are able to more easily identify and bring in new developers. Best of all, more source code is created and released for the use and benefit of all.

Google funded approximately 400 student projects in 2005, 600 in 2006, 900 in 2007 and 1125 in 2008 and will be funding approximately 1,000 student projects in 2009.

Applications are accepted only from March 23rd through April 3rd. That is still a short window, but in previous years applications were accepted for only one week. Organizations hosting students include: Creative Commons, MySQL, Debian, The Electronic Frontier Foundation/The Tor Project, haskell.org, Grameen Foundation USA, National Center for Supercomputing Applications, Ruby on Rails, Wikimedia Foundation and WordPress. See the full list of organizations and link to descriptions of the projects each organization offers.

See the externs.com internship directory (another curiouscat.com ltd. site) for more opportunities including those in science and engineering.

Related: Google Summer of Code Projects 2008 - posts on fellowships and scholarships - Larry Page on How to Change the World - comic on programmers - Interview of Steve Wozniak

Data Analysts Captivated by R’s Power

Data Analysts Captivated by R’s Power

data mining has entered a golden age, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models. Companies as diverse as Google, Pfizer, Merck, Bank of America, the InterContinental Hotels Group and Shell use it.

Close to 1,600 different packages reside on just one of the many Web sites devoted to R, and the number of packages has grown exponentially. One package, called BiodiversityR, offers a graphical interface aimed at making calculations of environmental trends easier.

Another package, called Emu, analyzes speech patterns, while GenABEL is used to study the human genome. The financial services community has demonstrated a particular affinity for R; dozens of packages exist for derivatives analysis alone. “The great beauty of R is that you can modify it to do all sorts of things,” said Hal Varian, chief economist at Google. “And you have a lot of prepackaged stuff that’s already available, so you’re standing on the shoulders of giants.”

R first appeared in 1996, when the statistics professors Ross Ihaka and Robert Gentleman of the University of Auckland in New Zealand released the code as a free software package. According to them, the notion of devising something like R sprang up during a hallway conversation. They both wanted technology better suited for their statistics students, who needed to analyze data and produce graphical models of the information. Most comparable software had been designed by computer scientists and proved hard to use.

R is another example of great, free, open source software. See R packages for Statistics for Experimenters.

via: R in the news

Related: Mistakes in Experimental Design and Interpretation - Data Based Decision Making at Google - Freeware Math Programs - How Large Quantities of Information Change Everything

So What are Genetic Algorithms?

Genetic Algorithms: Cool Name and Damn Simple is a very nice explanation, with Python code, of genetic algorithms.

What Can Genetic Algorithms Do?
In a word, genetic algorithms optimize. They can find better answers to a question, but not solve new questions. Given the definition of a car, they might create a better car, but they’ll never give you an airplane.

For each generation we’ll take a portion of the best performing individuals as judged by our fitness function. These high-performers will be the parents of the next generation.

We’ll also randomly select some lesser performing individuals to be parents, because we want to promote genetic diversity. Abandoning the metaphor, one of the dangers of optimization algorithms is getting stuck at a local maximum and consequently being unable to find the real maximum. By including some individuals who are not performing as well, we decrease our likelihood of getting stuck.
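The selection step described above can be sketched in a few lines of Python. This is my own illustration, not the code from the linked article: the function name, the `retain` and `random_select` parameters, their default values, and the choice that a higher fitness score is better are all assumptions.

```python
import random


def select_parents(population, fitness, retain=0.2, random_select=0.05, rng=random):
    """Keep the top `retain` fraction, plus a random few of the rest for diversity."""
    # Assumption: a higher fitness score means a better individual.
    ranked = sorted(population, key=fitness, reverse=True)
    cutoff = max(1, int(len(ranked) * retain))
    parents = ranked[:cutoff]
    # Give each lesser performer a small chance of becoming a parent anyway.
    for individual in ranked[cutoff:]:
        if rng.random() < random_select:
            parents.append(individual)
    return parents


# Toy population: 50 individuals, each a list of 5 digits, scored by their sum.
population = [[random.randint(0, 9) for _ in range(5)] for _ in range(50)]
parents = select_parents(population, fitness=sum)
print(len(parents), "parents selected from", len(population))
```

Raising `random_select` trades some short-term progress for more diversity, which is the hedge against getting stuck at a local maximum.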

Related: DNA Seen Through the Eyes of a Coder - Evolutionary Design - Algorithmic Self-Assembly - The Chip That Designs Itself

Solving the Toughest Problems in Computer Science

Software Breakthroughs: Solving the Toughest Problems in Computer Science, 2004:

Bill Gates’ talk at MIT provided an optimistic view of the next generation of computer science, now that the “rough draft” is done. Gates finds a paradox today in that computer science is poised to transform work and home life, “but people’s excitement level is not as high as it was five years ago during the Internet bubble.” Because most sectors of the computer industry—from microchips to storage, displays to wireless connectivity— continuously improve in performance, Gates predicts a flood of new products and applications. He sported a wristwatch that receives data wirelessly, as well as keeps its user on schedule. Gates describes “rich, new peripherals” such as ultra-wideband digital cameras and he demonstrates software that allows pictures to be archived using a 3D visual interface with a built-in time, date, and keyword database. He says that computer science is merging with and making over such fields as astronomy and biology, by unifying vast, unwieldy data collections into easily navigable libraries. And Gates appears confident that technological breakthroughs will ultimately resolve urgent problems of computer and network security.

Related: Bill Gates Interview from 1993 - Donald Knuth – Computer Scientist - Open Source: The Scientific Model Applied to Programming - Internship with Bill Gates

Rumors of Software Engineering’s Death are Greatly Exaggerated

Rumors of Software Engineering’s Death are Greatly Exaggerated by Steve McConnell

Indeed, one of the hallmarks of engineering as opposed to science is that engineers will work with materials whose properties are not entirely understood, and they’ll factor in safety margins until the science comes along later and allows more precision in the engineer’s use of those materials.

Software engineering already has been defined as engineering, we have an international reference standard for that definition, the field’s two largest professional bodies have jointly adopted a professional code of conduct for software engineers, we have accreditation standards for university programs in software engineering, we have numerous university programs that have already been accredited, and several countries are licensing professional engineers in software.

Related: Who Killed the Software Engineer? - Is Computer Science a Science? - What Ails India’s Software Engineers? - Federal Circuit Decides Software No Longer Patentable - A Career in Computer Programming

How We Found the Missing Memristor

How We Found the Missing Memristor By R. Stanley Williams

For nearly 150 years, the known fundamental passive circuit elements were limited to the capacitor (discovered in 1745), the resistor (1827), and the inductor (1831). Then, in a brilliant but underappreciated 1971 paper, Leon Chua, a professor of electrical engineering at the University of California, Berkeley, predicted the existence of a fourth fundamental device, which he called a memristor. He proved that memristor behavior could not be duplicated by any circuit built using only the other three elements, which is why the memristor is truly fundamental.

the memristor’s potential goes far beyond instant-on computers to embrace one of the grandest technology challenges: mimicking the functions of a brain. Within a decade, memristors could let us emulate, instead of merely simulate, networks of neurons and synapses. Many research groups have been working toward a brain in silico: IBM’s Blue Brain project, Howard Hughes Medical Institute’s Janelia Farm, and Harvard’s Center for Brain Science are just three. However, even a mouse brain simulation in real time involves solving an astronomical number of coupled partial differential equations. A digital computer capable of coping with this staggering workload would need to be the size of a small city, and powering it would require several dedicated nuclear power plants.

Related: Demystifying the Memristor - Understanding Computers and the Internet - 10 Science Facts You Should Know

The Chip That Designs Itself

The chip that designs itself by Clive Davidson, 1998

Adrian Thompson, who works at the university’s Centre for Computational Neuroscience and Robotics, came up with the idea of self-designing circuits while thinking about building neural network chips. A graduate in microelectronics, he joined the centre four years ago to pursue a PhD in neural networks and robotics.

To get the experiment started, he created an initial population of 50 random circuit designs coded as binary strings. The genetic algorithm, running on a standard PC, downloaded each design to the Field Programmable Gate Arrays (FPGA) and tested it with the two tones generated by the PC’s sound card. At first there was almost no evidence of any ability to discriminate between the two tones, so the genetic algorithm simply selected circuits which did not appear to behave entirely randomly. The fittest circuit in the first generation was one that output a steady five-volt signal no matter which tone it heard.

By generation 220 there was some sign of improvement. The fittest circuit could produce an output that mimicked the input – wave forms that corresponded to the 1kHz or 10kHz tones – but not a steady zero or five-volt output.

By generation 650, some evolved circuits gave a steady output to one tone but not the other. It took almost another 1,000 generations to find circuits that could give approximately the right output and another 1,000 to get accurate results. However, there were still some glitches in the results and it took until generation 4,100 for these to disappear. The genetic algorithm was allowed to run for a further 1,000 generations but there were no further changes.

See Adrian Thompson’s home page (Department of Informatics, University of Sussex) for more on evolutionary electronics, such as Scrubbing away transients and Jiggling around the permanent: Long survival of FPGA systems through evolutionary self-repair:

Mission operation is never interrupted. The repair circuitry is sufficiently small that a pair could mutually repair each other. A minimal evolutionary algorithm is used during permanent fault self-repair. Reliability analysis of the studied case shows the system has a 0.99 probability of surviving 17 times the mean time to local permanent fault arrival. Such a system would be 0.99 probable to survive 100 years with one fault every 6 years.

Very cool.
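The two reliability figures in that abstract line up with simple arithmetic: 17 times a 6-year mean time between faults is just over a century. A quick check, using only the numbers quoted above (the variable names are mine):

```python
mean_years_between_faults = 6  # "one fault every 6 years"
survival_multiple = 17         # "17 times the mean time to local permanent fault arrival"

# Horizon over which the quoted 0.99 survival probability applies.
horizon_years = survival_multiple * mean_years_between_faults
print(horizon_years)  # 102
```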

Related: Evolutionary Design - Invention Machine - Evo-Devo

How Large Quantities of Information Change Everything

Scale: How Large Quantities of Information Change Everything

There’s another important downside to scale. When we look at large quantities of information, what we’re really doing is searching for patterns. And being the kind of creatures that we are, and given the nature of the laws of probability, we are going to find patterns. Distinguishing between a real legitimate pattern, and something random that just happens to look like a pattern can be somewhere between difficult and impossible. Using things like Bayesian methods to screen out the false positives can help, but scale means that scientists need to learn new methods – both the new ways of doing things that they couldn’t do before, and the new ways of recognizing when they’ve screwed up.

There’s the nature of scale. Tasks that were once simple have become hard or even impossible, because they can’t be done at scale. Tasks that were once impossible have become easy because scale makes them possible. Scale changes everything.
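To put a rough number on the false-pattern problem described above: if you run many independent checks on pure noise, each with a small chance of a false alarm, the chance that at least one of them "finds" something grows quickly. The sketch below assumes independent tests and a 5% false-alarm rate per test. Both are my simplifying assumptions, not figures from the quoted post.

```python
def chance_of_false_pattern(num_tests, alpha=0.05):
    """Probability that at least one of `num_tests` independent tests on
    pure noise produces a false positive, given per-test rate `alpha`."""
    return 1 - (1 - alpha) ** num_tests


for n in (1, 10, 100, 1000):
    print(n, round(chance_of_false_pattern(n), 3))
```

With these assumptions the chance is already about 40% at 10 tests and above 99% at 100, which is why screening methods matter more as the data grows.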

I discussed related ideas on my Curious Cat Management Improvement blog recently: Does the Data Deluge Make the Scientific Method Obsolete?

Related: Seeing Patterns Where None Exists - Mistakes in Experimental Design and Interpretation - Optical Illusions and Other Tricks on the Brain - Data Based Decision Making at Google

von Neumann Architecture and Bottleneck

We each use computers a great deal (to write and read this blog, for example) but often have little understanding of how a computer actually works. This post gives some details on the inner workings of your computer: What Your Computer Does While You Wait

People refer to the bottleneck between CPU and memory as the von Neumann bottleneck. Now, the front side bus bandwidth, ~10GB/s, actually looks decent. At that rate, you could read all of 8GB of system memory in less than one second or read 100 bytes in 10ns. Sadly this throughput is a theoretical maximum (unlike most others in the diagram) and cannot be achieved due to delays in the main RAM circuitry.
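The two figures in that excerpt follow directly from dividing the amount of data by the bandwidth. A small check, assuming decimal units (1GB = 10^9 bytes) and treating ~10GB/s as exactly 10GB/s. The function name is mine.

```python
BANDWIDTH_BYTES_PER_SEC = 10e9  # ~10GB/s, the quoted theoretical maximum


def transfer_time_seconds(num_bytes, bandwidth=BANDWIDTH_BYTES_PER_SEC):
    """Ideal transfer time, ignoring the RAM delays the excerpt mentions."""
    return num_bytes / bandwidth


print(transfer_time_seconds(8e9))  # 8GB of system memory: 0.8 seconds
print(transfer_time_seconds(100))  # 100 bytes: 1e-08 seconds, i.e. 10ns
```

As the excerpt notes, these are best-case numbers; real reads are slower because of delays in the main RAM circuitry.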

Sadly the southbridge hosts some truly sluggish performers, for even main memory is blazing fast compared to hard drives. Keeping with the office analogy, waiting for a hard drive seek is like leaving the building to roam the earth for one year and three months. This is why so many workloads are dominated by disk I/O and why database performance can drive off a cliff once the in-memory buffers are exhausted. It is also why plentiful RAM (for buffering) and fast hard drives are so important for overall system performance.

Related: Free Harvard Online Course (MP3s) Understanding Computers and the Internet - How Computers Boot Up - The von Neumann Architecture of Computer Systems - Five Scientists Who Made the Modern World (including John von Neumann)
