## Generalized Nonaveraging Integer Sequences

January 9, 2015

Posted by Dennis in Uncategorized.

Everybody from this blog who participated in RSI 2009 has now just finished their undergraduate degree. I think it’d be interesting to have an update to see how everybody is doing.

I think in the future, I’ll add posts here either to 1) explain intuitively what happens in a paper of mine or 2) write up the proof or idea of something that is hard for me to find in the literature. The audience for 1) is going to be really small, considering the lack of papers and of people who read them (but I feel sorry for them), and 2) would be to help myself learn.

This time, I’ll try to outline intuitively what went on in my RSI project 5 years ago, the reason being that the main ideas (I think) are pretty well-disguised behind the technical details, which are really just a bunch of casework. The paper is here: http://arxiv.org/pdf/1107.1756.pdf.

As for motivation, recall the open problem of finding the largest subset $S$ of $\{0,1,\ldots,N\}$ that contains no three-term arithmetic progression. Most of the research has focused on finding asymptotics for the size of $S$ for large $N$. I won’t claim enough familiarity with the current literature to give the best current bounds, but I dimly recall that five years ago there was a lower bound constructed by Elkin and an upper bound by Bourgain, and that they were the best then.

Disregarding the current literature, the most naive way to approach this problem is to use the greedy algorithm. Namely, you start with $S$ empty, and then repeatedly add the smallest number possible that does not violate the condition that there is no 3-term arithmetic progression.

It turns out that the greedy algorithm generates a sequence $S$ containing precisely the integers whose base-3 expansions use only the digits 0 and 1. This extremely explicit solution is nice, but we can also easily tell that it is horrible asymptotically. For example, the greedy algorithm gives $|S|=\Omega(N^{\log_3(2)})$, while Elkin’s construction alluded to above gives $|S| = \Omega(\frac{N}{2^{2\sqrt{2}\sqrt{\log_2(N)}}}\log^{1/4}(N))$.
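To make this concrete, here’s a quick sketch (my own illustration in Python, not code from the paper) of the greedy algorithm together with the base-3 characterization:

```python
def greedy_no_3ap(limit):
    """Greedily pick integers in [0, limit) avoiding 3-term arithmetic progressions.

    A new candidate n is always the largest number considered so far, so it
    completes a progression x, y, n exactly when x = 2*y - n for some
    earlier picks x < y; that is the only condition to check.
    """
    chosen = set()
    for n in range(limit):
        if not any(2 * y - n in chosen for y in chosen):
            chosen.add(n)
    return sorted(chosen)


def base3_digits_01(n):
    """True iff n's base-3 expansion uses only the digits 0 and 1."""
    while n:
        if n % 3 == 2:
            return False
        n //= 3
    return True


print(greedy_no_3ap(30))  # [0, 1, 3, 4, 9, 10, 12, 13, 27, 28]
```

The two descriptions agree: `greedy_no_3ap(30)` returns exactly the $n < 30$ for which `base3_digits_01(n)` holds.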

The condition that the sequence has no three-term arithmetic progression is equivalent to avoiding (nontrivial) solutions to $x+y=2z$, where nontrivial here means ruling out $x=y=z$. We would like to generalize this. For example, if you generate the greedy sequence that contains no solution to $x+y+z=3w$, where $x,y,z,w$ are all different, you get numbers of the form $3M + R$, where $M$ is a number whose base-4 representation has only 0’s and 1’s, and $R\in \{0,1,2,3,4\}$. This is a result due to Layman, in a short paper in the Journal of Integer Sequences.
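Here’s the same kind of sketch (again my own illustration, not Layman’s code) for the greedy sequence avoiding $x+y+z=3w$ with all four variables distinct. One simplification: since terms are added in increasing order, a new term $n$ can only play the role of $x$, $y$, or $z$; as $w$ it would need three smaller terms summing to $3n$, which is impossible.

```python
from itertools import combinations


def greedy_no_xyz_3w(limit):
    """Greedily avoid solutions to x + y + z = 3*w with x, y, z, w all distinct."""
    chosen = []
    in_set = set()
    for n in range(limit):
        bad = False
        for x, y in combinations(chosen, 2):  # two distinct earlier terms
            s = x + y + n
            # w = s/3 must be an earlier term distinct from x and y
            # (it is automatically distinct from n, since w < n)
            if s % 3 == 0 and s // 3 in in_set and s // 3 not in (x, y):
                bad = True
                break
        if not bad:
            chosen.append(n)
            in_set.add(n)
    return chosen


print(greedy_no_xyz_3w(17))  # [0, 1, 2, 3, 4, 12, 13, 14, 15, 16]
```

Already in the first few terms you can see them arriving in consecutive blocks, which foreshadows the “perfect jump” behavior discussed below.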

What happens if we consider avoiding solutions to
\begin{aligned} a_0x_0+\cdots+a_nx_n = (a_0+\cdots+a_n)x_{n+1} \end{aligned}
where all the $x_i$‘s are different? (There is also the related problem, I think slightly easier but still hard, where you instead only require that the $x_i$‘s are not all the same, but let’s focus on this case for now.)
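For exploring small cases, here is a brute-force sketch of the generic greedy procedure (my own, assuming the coefficients $a_i$ are positive integers). It is exponential in the number of variables, so it’s only good for generating a modest number of terms:

```python
from itertools import permutations


def greedy_avoiding(a, limit):
    """Greedy sequence avoiding a[0]*x_0 + ... + a[k-1]*x_{k-1} = sum(a)*x_k
    with all of x_0, ..., x_k distinct (assumes positive integer coefficients).

    A new term n is the largest so far, so it can only fill a left-hand
    slot: as x_k, the left side would be strictly below sum(a)*n.
    """
    total = sum(a)
    chosen, in_set = [], set()
    for n in range(limit):
        if not _creates_solution(a, total, chosen, in_set, n):
            chosen.append(n)
            in_set.add(n)
    return chosen


def _creates_solution(a, total, chosen, in_set, n):
    k = len(a)
    for j in range(k):  # which left-hand slot n occupies
        for rest in permutations(chosen, k - 1):  # distinct earlier terms
            xs = list(rest[:j]) + [n] + list(rest[j:])
            s = sum(ai * xi for ai, xi in zip(a, xs))
            if s % total == 0 and s // total in in_set and s // total not in xs:
                return True
    return False
```

With `a = [1, 1]` this recovers the 3-AP-free greedy sequence above (for $x+y=2z$, a solution with two variables equal forces all three equal, so “all distinct” and “not all equal” coincide), and `a = [1, 1, 1]` gives the $x+y+z=3w$ sequence.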

In general, I think we don’t understand this at all (or at least we didn’t understand it 5 years ago). From generating these sequences with a computer, it seems that if you have a bunch of terms that are close together, then that forces the next terms in the sequence to be farther apart. Similarly, if a bunch of terms are far apart, then the next terms in the sequence are closer together. This makes complete sense heuristically.

I wish I still had the actual graphs to give an idea, but the graph of the sequence would typically be flat at the beginning, then suddenly grow because of the terms bunched together. Then, since the terms are now spread out, it would suddenly become flat again, and this would repeat. Rather than simply cycling, though, the stretches where the graph is either flat or growing really fast get longer with each iteration. I think they grow geometrically, but it was hard to compute enough terms to tell.

In some very special cases, this pattern is so extreme that the part where the graph grows really fast is instead a perfect jump. This is because the terms at the beginning are so bunched together that there are simply no gaps left to stick any more terms into. Consider, for example, the sequence that avoids $x_1+x_2+x_3+x_4=4x_5$.

My RSI paper was just an attempt to find some of these sequences and their closed forms (for example, the one above). The idea is the same as in Layman’s paper: the closed form is $cM+R$, where $c$ is some number, $R$ is some set of initial terms, and $M$ ranges over the integers with only 0’s and 1’s in their base-$d$ representation, where $d=a_0+\cdots+a_n+1$. Basically, the elements of $R$ prevent anything else less than $c$ from being in the sequence, and $c$ is so large that you can treat $M$ and $R$ independently when plugging into the equation $a_0x_0+\cdots+a_nx_n=(a_0+\cdots+a_n)x_{n+1}$. As you can tell, this only happens in extremely special cases, but, in particular, it encompasses the case where all the $a_i$‘s are 1.
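To illustrate the shape of these closed forms, here’s a small sketch (mine; the actual values of $c$ and $R$ for each equation are worked out in the paper, and I’m not asserting any particular ones here) that builds the set $\{cM + r : r \in R\}$ from the base-$d$ “digits 0 and 1” numbers:

```python
def zero_one_base_d(d, count):
    """First `count` nonnegative integers whose base-d digits are all 0 or 1.

    The k-th such number is obtained by reading the binary digits of k
    as base-d digits.
    """
    out = []
    for k in range(count):
        val, place = 0, 1
        while k:
            val += (k & 1) * place
            k >>= 1
            place *= d
        out.append(val)
    return out


def closed_form_set(c, R, d, count):
    """The set {c*M + r : r in R}, for the first `count` choices of M."""
    return sorted(c * m + r for m in zero_one_base_d(d, count) for r in R)


print(zero_one_base_d(4, 8))  # [0, 1, 4, 5, 16, 17, 20, 21]
```

As a sanity check, the original 3-AP case fits this template with $c=1$, $R=\{0\}$, and $d=3$: `closed_form_set(1, [0], 3, count)` is just the base-3 sequence from before.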

In general, I think it’s interesting to try to find the growth rate of these sequences, but my guess is that it is difficult. In fact, during RSI I didn’t find anything in the literature that beats the silly counting argument as an upper bound (other than heuristic arguments). It’s messy, yet not random.