
2023/04/01

"Attention" and "Transformers" in Large Language Models

Everyone is talking about OpenAI's ChatGPT these days. Here's a very quick attempt to summarize the core idea behind large language models (LLMs) like GPT.

"Attention is all you need" (aka the transformer paper) published in 2017 by Vaswani et al from Google is still the mother of current LLMs, including GPT.  "Effective Approaches to Attention-based Neural Machine Translation", an earlier paper by Luong et al from Stanford, was also quite important.

These are sequence-to-sequence models, i.e. their job is to map an input sequence of text to an output sequence of text. Applications include translation from one language to another, answering questions, having a conversation, etc.

They use language embeddings (made famous by Word2vec in 2013 and later by BERT, both also from Google) as the basic encoding/decoding building blocks, i.e. mapping text to vectors of real numbers in an "embedding space".
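To make the embedding idea concrete, here is a minimal NumPy sketch: a toy vocabulary and a randomly initialized embedding table standing in for a trained one. The vocabulary, dimensions, and values are all illustrative, not from any real model (real models also use subword tokenizers rather than whole words).

```python
import numpy as np

# hypothetical tiny vocabulary; real models use subword tokenizers
vocab = {"the": 0, "cat": 1, "sat": 2}
d_model = 8  # embedding dimension (illustrative)

# random stand-in for a trained embedding table: one row per token
rng = np.random.default_rng(42)
embedding_table = rng.normal(size=(len(vocab), d_model))

def embed(tokens):
    """Map each token to its row vector in the embedding table."""
    return embedding_table[[vocab[t] for t in tokens]]

vectors = embed(["the", "cat", "sat"])
print(vectors.shape)  # (3, 8): one vector per input token
```

In a trained model, the rows of this table are learned so that related words end up near each other in the embedding space.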

The main new idea is in the architecture of the neural network between the input encoding and output decoding stages. The model uses the preceding terms in the current output sequence to decide which parts of the input sequence to pay more "attention" to for the next output term. A bit more precisely: the previous output acts as a "query", which is matched against "keys" derived from the input to produce a weighted linear combination of "values", also derived from the input. That in turn gets transformed into the next output term by a few more layers of a plain feed-forward network (i.e. a stack of layers of neurons, where each neuron applies a non-linear activation function to a linear combination of its inputs). Each step has trainable weights.
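The query/key/value step described above is, in the transformer paper's terms, scaled dot-product attention. Here is a minimal NumPy sketch of that mechanism; the shapes and random values are toy examples, not a real model.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax: rows become weights summing to 1."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each query row selects a
    weighted linear combination of value rows, with weights
    derived from query-key similarity."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # query-key similarities
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # linear combination of values

# toy example: 2 queries attending over 3 key/value pairs, dim 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = attention(Q, K, V)
print(out.shape)  # (2, 4): one output vector per query
```

In the real model, Q, K, and V are not raw inputs but are produced from the input and output embeddings by trained weight matrices, and many such "attention heads" run in parallel.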

There are also clever tricks besides "attention". One is positional encoding, which represents word order: the same input term in a different position has a different effect, even though, unlike in recurrent neural networks, the transformer network otherwise sees the input as a bag of words that could be in any order. Another is layer normalization, which keeps the nonlinear outputs within a reasonable region of the embedding vector space.
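Both tricks can be sketched in a few lines of NumPy. The positional encoding below is the sinusoidal form from the transformer paper; the layer normalization is the standard per-vector version. The sizes are illustrative.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: each position gets a unique
    pattern of sines and cosines at different frequencies, which is
    added to the token embeddings so order information survives."""
    pos = np.arange(seq_len)[:, None]        # (seq_len, 1)
    i = np.arange(d_model)[None, :]          # (1, d_model)
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def layer_norm(x, eps=1e-5):
    """Normalize each vector to zero mean and unit variance along
    its last axis, keeping activations in a reasonable range."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

pe = positional_encoding(seq_len=10, d_model=16)
x = layer_norm(pe)
print(pe.shape)  # (10, 16): one encoding vector per position
```

(Real transformers also add trainable scale and shift parameters after the normalization; they are omitted here for brevity.)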

This architecture, as far as I know, was not derived explicitly from the way human brains work. The "attention" analogy is really useful, but there are no principles saying this architecture is more fundamental to intelligence, or more natural, than many others. It just happens to produce remarkably good results when the weights are trained properly.

So that's the basic idea of contemporary LLMs. Of course, in some sense, all computer neural networks are just a bunch of matrix multiplications and ad-hoc activation functions. But you can't just connect a large number of mathematical "neurons" randomly in a network and hope it learns something. The choice of architecture, i.e. how the "neurons" are connected, is key. On top of that, there is still an enormous amount of innovation and engineering needed to build real-world language models, not to mention turn them into a product like ChatGPT or Google Bard.

2017/05/08

The message is the medium

I've started posting stuff over on medium.com/@nemozen

After many years on Blogger, I finally got tired of trying to write posts on the Blogger app and failing, so I gave up. Medium, both the app and the website, seems to be really author-friendly.

2015/05/04

Update on Ethiopic transliteration in Gmail, Google Docs, Blogger, etc.

ሰላም ዓለም!

Transliteration is the conversion of text from one script to another. For example, typing something in the Roman alphabet, like "selam alem", and having it show up in fidel (Ethiopic script) as ሰላም ዓለም. This is a really convenient way for people who want to write in languages that use non-Roman scripts to do so on an ordinary computer with a Roman-letter keyboard.

A few years ago (it's hard to believe it's already been that long!) Ethiopic transliteration in Gmail, Google Docs, etc. was launched. With several user interface changes in Google products since then, the instructions in that post are a bit out of date. Here's a quick update so there's an easy reference somewhere.

Transliteration as a standalone tool

To use it as a standalone tool, where you can just type text to copy elsewhere, go to google.com/intl/am/inputtools/try/ or google.com/transliterate/amharic (similarly for Tigrinya).

How to write Amharic or Tigrinya in Gmail, and in Google Docs:

To use Amharic transliteration directly while writing inside Gmail, Docs, Blogger, Sites, etc., you need to set it as an "input method". (All of the following works with Tigrinya as well as Amharic; just select Tigrinya instead in the settings.)
  1. Click the little gear icon in the top right corner of the Gmail window and select Settings.
  2. In settings, under the "General" tab, in the "Language" section, click on "Show all language options" and then click on the checkbox to "Enable input tools".
  3. Click on the "Edit tools" link right next to it. A large window will pop-up with various languages on the left under Input Tools, select Amharic and move it over to the right column under "Selected input tools" using the big arrow, in the middle of the window, and click OK
  4. Save the settings: back on the settings page, make sure you scroll all the way down and click on Save.
  5. Once you have done that, you will see the አ icon right next to the gear icon in the top right in Gmail. Just click on that icon whenever you want to switch to typing in Amharic 
  6. Then, as you type phonetically in Roman letters and finish each word, the corresponding text in Ethiopic shows up.
  7. You don't have to memorize any rules; just type the words naturally as they sound, and it will figure out the best transliteration. For example, "negergn" becomes ነገርግን but "negeregn" becomes ነገረኝ. Notice that "gn" gives different results in the two cases. The transliteration shows up as you type, showing multiple candidates, and when you hit space at the end of the word, the top one is automatically chosen. You can also select one of the other candidates if the top one is not what you mean.
  8. This also works in Google Docs, Blogger,  Google Sites and most Google products that have text input.

P.S. Ethiopic font

Note that before you can use transliteration, your computer must have an Ethiopic (Ge'ez) font installed. Most recent versions of Windows or Mac have one pre-installed, so you can skip this part. If you can see the following text "ሰላም ዓለም" (or can read the text on this web page), then you have an Ethiopic font installed. If you can't see it, then you need to install a font, like this one for example.

2010/03/20

Zen for my domain

After procrastinating about it for years, I've finally switched this blog to my personal domain name (in case you're curious, here's how). So adios nemozen.blogspot.com and hello nemozen.semret.org!
How long did I procrastinate? Suffice it to say, I've had that domain name since before blogger/blogspot or even the term "blogging" existed... Now maybe in another 3-4 years I will customize the templates! In the meantime, I hope this doesn't break the feedopology.

2009/01/22

Nemo, Zen and the art of 20%



One of the cool things about my employer is the concept of 20% time. Basically it's a license to spend a chunk of your time working on things that you think are good, useful, interesting, etc., but otherwise might not get done. Recently I finished a small 20% project, and it was officially announced today. Read all about it, and then go and try it!

Note: If you can't see Ethiopic fonts on your computer, here are some links to Ethiopic fonts to download and install.

2007/05/07

As of May 7, 2007,

for the record, I am no longer writing about certain topics and a certain company in particular. Regular readers (hey me) have noted it already!

2007/04/12

Google, Youtube, Mark Cuban

Mark Cuban makes a great point: "Gootube" (as he calls Youtube now that it is owned by Google) is forced to be a free bandwidth video hosting/delivery service, rather than a media company, because, due to the DMCA, it can't monetize the content by putting ads around it. How come nobody else is talking about that? That's much more fundamental than whatever content deals or lawsuits they are involved in this week.

Still, I think Cuban is too pessimistic about Gootube. It's not like there's no hope for them, quite the contrary. Consider this: instead of monetizing the content themselves, they can enable monetization, with two features for the content uploader:

1) pay-per-view powered by Google Checkout payment services,

2) ads powered by Google Adsense or whatever video ad solution they have.

In both cases, the content owner gets the revenue directly and Gootube makes money on fees, like Ebay. These fees can be just as high as the profits from licensing content and monetizing it themselves like a media company (higher if you believe the "long tail" content has more value than big media properties).

Plus they can truly claim to be just the service provider and therefore not liable for copyright violations.

Now throw in a third feature

3) "Gootube Pro" service with no 10-minute limit, where they charge the content owner for bandwidth, a for-pay mass market video content hosting/delivery service, and they can have all the "big" content too.

Why would they try to beat the media companies at their own game when they have the infrastructure and technology to play a unique and extremely lucrative role as an enabling platform for mass Internet video, for both user-generated content and mainstream media properties?

When everyone thought search technology was just a commodity piece of a portal, Google succeeded by technological strengths in search. Now with everyone thinking about social media and big media monetization of content, maybe they will go with their strengths as a distributed platform for hosting and delivery of content, payment and advertising services.

2006/12/19

Google and wisdom of crowds

Google is putting crowd wisdom to work. Seems like all the great ideas take about 10 years longer than expected. Of course there was Admiral Poindexter's terrorism prediction market, which was proposed after Sept 11... I first heard of it as "Idea Futures", way back in the mid-90s, which someone was going to commercialize, but somehow it never worked out. Oh wait... Foresight Exchange.

Google is taking all these good ideas lying around. I guess that's the real secret. Innovation is not like a gold rush, the ideas are there in plain sight. You just have to pick them up one by one, when you are ready, and polish them up. Of course it helps if you are very smart, and have an unbelievable cash flow because you were disciplined about your first idea.

So the question now is, why can't Google use the "wisdom of crowds" inherent in the search engine? Of course selling search ads is a form of that. But I mean in the predictive sense. To take a very crude example, if you know the key words searched by people from a particular company, then that could tell you something about what that company is doing. Not in any direct way of course like: the CEO searched for "bankrupcy law". But in a massive way with tiny correlations being detected in mountains of data.... Oh wait how do we know they are not already doing it?