in and exists

David Clement

September 2004

Oracle versions 7-8


The SQL operators in and exists are often confused, but they are processed quite differently in Oracle. The purpose of exists is to determine if an element occurs in a set; the purpose of in is to examine a set related to the containing query. A good rule of thumb is that exists is almost always the operator to use simply to check if a row exists in a table.

A example from Oracle's "Ask Tom" column gives a logically equivalent process for each of two queries that return the same results, one of which uses in while the other uses exists.

First,

          select * from t1 where x in (select y from t2);
								
is equivalent to
          select * from t1, (select distinct y from t2) t2
          where t1.x = t2.y;
        
In this case, the inoperator returns a processed set that can be inner-joined to the containing query. This is fast if t2 is smaller than t1, if the sort (unique) operation on t2 implied by the distinct keyword is not time-consuming, and if t1.x has an index.

Second,

          select * from t1 where exists 
          (select null from t2 where y = t1.x);
								
is equivalent to
          for x in (select * from t1) loop
              if (select null from t2 where y = x.x) then
                  print x;
              else
                  continue;
              end if;
          end loop;
        
This is pseudocode because there is no other PL/SQL construct that does what exists does. In this case, for each row in the containing query, a test is made for matching rows. There is no sort (unique) operation. This is fast if t1 is smaller than t2 and if t2.y has an index, and this statement has far less overhead.

An interesting point is that select null in the above is the same as select 1 or select 'A' or any other constant, although there have been flame wars about this in Oracle forums.

The in operator is overloaded. There is a logically distinct list iteration operator of the same name that processes a set of literals or host variables. An example would be

          where x in (4, 9, 72, 3)
								
The execution plan for this statement is guaranteed to be the same as for
          where ( (x = 4) or (x = 9) or (x = 72) or (x = 3) )
								
The in operator on a list is easier to type than the nested or conditions, and is otherwise the same, so you should use it in that context.