Why is a Variable declared as NVARCHAR(MAX) dropping chunks of the string?

For whatever reason, a query is being built as a string and passed off to be executed by another stored procedure.

The query is massive.

Over a thousand lines, and we've run into an issue that requires me to debug it.

The query is being built into a declared NVARCHAR(MAX) variable, but something odd is happening when I print it off using the following -

WHILE @Printed < @ToPrint BEGIN 
    PRINT(SUBSTRING(
        @sql, @Printed, 4000))
    SET @Printed = @Printed + 4000
    PRINT('Printed: ' + CONVERT(VARCHAR, @Printed))
END

At a certain place in the printed message, it's just... dropping a chunk, and I don't understand why. NVARCHAR(MAX) should be able to hold War and Peace over 100 times, and this query is NOT War and Peace.

I know PRINT(...) has a limitation of only being able to print off 4000 characters at a time (hence the loop), but that doesn't explain why the @sql variable is just losing a chunk in places.

If it helps, specifically, the place where the chunk is dropping is about 1,600 characters after the first 4,000 characters are printed.

Why is it doing this? Am I missing a system setting at the start of the query (like NOCOUNT or ARITHABORT)? I don't even know what those do, or whether they're involved at all.


EDIT : MCVE : Here. To reproduce, copy-paste into Microsoft SQL Server Management Studio and hit 'F5'. The message printed will not include @sql in its entirety.


This is working fine for me:

DECLARE @sql nvarchar(max) = 
    REPLICATE(CONVERT(nvarchar(max), N'a'), 4000)
  + REPLICATE(CONVERT(nvarchar(max), N'b'), 4000)
  + REPLICATE(CONVERT(nvarchar(max), N'c'), 4000)
  + REPLICATE(CONVERT(nvarchar(max), N'd'), 4000)
  + REPLICATE(CONVERT(nvarchar(max), N'e'), 4000)
  + REPLICATE(CONVERT(nvarchar(max), N'f'), 4000)
  + REPLICATE(CONVERT(nvarchar(max), N'g'), 4000)
  + REPLICATE(CONVERT(nvarchar(max), N'h'), 4000)
  + REPLICATE(CONVERT(nvarchar(max), N'i'), 4000);


PRINT LEN(@sql);  -- characters
PRINT DATALENGTH(@sql); -- bytes
PRINT '';

DECLARE @Printed int = 1, @ToPrint int = LEN(@sql);

WHILE @Printed < @ToPrint BEGIN 
    PRINT(SUBSTRING(
        @sql, @Printed, 4000))
    SET @Printed = @Printed + 4000
    PRINT('Printed: ' + CONVERT(varchar(11), @Printed)) -- *
END

* Always specify length.

Output is:

36000
72000

aaaaaaaaaa... 4000 As ...aaa
Printed: 4001
bbbbbbbbbb... 4000 Bs ...bbb
Printed: 8001
cccccccccc... 4000 Cs ...ccc
Printed: 12001
dddddddddd... 4000 Ds ...ddd
Printed: 16001
eeeeeeeeee... 4000 Es ...eee
Printed: 20001
ffffffffff... 4000 Fs ...fff
Printed: 24001
gggggggggg... 4000 Gs ...ggg
Printed: 28001
hhhhhhhhhh... 4000 Hs ...hhh
Printed: 32001
iiiiiiiiii... 4000 Is ...iii
Printed: 36001

So, I think the problem is elsewhere. In any case, this is a really sloppy way to validate the contents of dynamic SQL. Instead I would do:

SELECT CONVERT(xml, @sql);

Then you can click on the output cell and it opens in an XML text editor for review. You can then copy and paste that output into a query window if you want IntelliSense or any chance of executing it, but you'll have to replace encoded characters (for example, &gt; becomes >). I talk about this approach (and another one) here:

  • Validate the contents of large dynamic SQL strings
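If the entity encoding gets in the way, one possible variation is to return the string as an XML processing instruction, which SQL Server does not entity-encode. This is only a sketch: the @sql value below is a made-up stand-in for the real dynamic SQL, and the node name x is arbitrary. A direct conversion to xml can also fail to parse when the string contains characters such as < or &, which this form sidesteps.

```sql
-- Hypothetical stand-in for the real dynamic SQL string.
DECLARE @sql nvarchar(max) =
    N'SELECT name FROM sys.objects WHERE object_id > 0 AND name <> N''x'';';

-- A processing-instruction node is returned without entity encoding,
-- so characters such as < and > come back exactly as they were typed.
SELECT @sql AS [processing-instruction(x)]
FOR XML PATH(''), TYPE;
```

The cell opens in the same XML editor, with the text wrapped in <?x ... ?>. Strip that wrapper before running anything you copy out of it.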

If you insist on doing it this bricklaying way, perhaps there is some kind of non-printing or string-termination character that's at that point. If you say it is around character 5,600 then you could do:

DECLARE @i int = 5550, @c nchar(1);
WHILE @i <= 5650
BEGIN
  PRINT '';
  SET @c = SUBSTRING(@sql, @i, 1);
  PRINT '------   ' + RTRIM(@i) + '------:';
  PRINT 'Raw:     ' + @c;
  PRINT 'ASCII:   ' + RTRIM(ASCII(@c));     -- RTRIM converts the int to a string
  PRINT 'UNICODE: ' + RTRIM(UNICODE(@c));   -- without it, the + raises a conversion error
  SET @i += 1;
END

You should be able to scan down and match the last sequence of characters you see in the broken print output. Then look for anything where the Raw: line is empty and the ASCII: line is anything other than typical (9, 10, 13, 32).
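If scanning a hundred PRINT lines by eye is tedious, a set-based pass over the whole string can list just the suspicious positions. The following is a rough sketch rather than a tested tool: the @sql value is a made-up example with an NCHAR(0) planted at position 10, and the cross join of sys.all_columns is only an assumed convenient source of row numbers.

```sql
-- Hypothetical string with a NUL character planted at position 10.
DECLARE @sql nvarchar(max) = N'SELECT 1;' + NCHAR(0) + N'SELECT 2;';

;WITH n AS
(
  -- One row per character position; the cross join is just a cheap
  -- way to generate enough rows for a long string.
  SELECT TOP (LEN(@sql))
         ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
  FROM sys.all_columns AS a
  CROSS JOIN sys.all_columns AS b
)
SELECT i AS position,
       UNICODE(SUBSTRING(@sql, i, 1)) AS code_point
FROM n
WHERE UNICODE(SUBSTRING(@sql, i, 1)) < 32                 -- control characters
  AND UNICODE(SUBSTRING(@sql, i, 1)) NOT IN (9, 10, 13)   -- except tab, LF, CR
ORDER BY i;
```

Any row returned marks a control character other than tab, line feed, or carriage return, which is worth a closer look.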

But I don't think this is the problem. I'll go back to an earlier comment where I suggested that the string itself is the problem. In the question, you mention @sql, but don't show how it's populated. I would bet that some string you're adding to that is getting truncated. Some things to look out for:

  • Intermediate variables/parameters declared as varchar/nvarchar but with no length (which sometimes leads to silent truncation at 1 character, and sometimes 30):

      DECLARE @sql nvarchar(max) = N'SELECT * FROM dbo.table ';
      DECLARE @where nvarchar = N'WHERE some condition...';
      SET @sql += @where;
      PRINT @sql;
    

    Output:

      SELECT * FROM dbo.table W
    
  • Intermediate variables/parameters declared as varchar/nvarchar but too short (which leads to silent truncation at whatever the declaration is):

      DECLARE @sql nvarchar(max) = N'SELECT * FROM dbo.table ';
      DECLARE @where nvarchar(10) = N'WHERE some condition...';
      SET @sql += @where;
      PRINT @sql;
    

    Output:

      SELECT * FROM dbo.table WHERE some
    
  • Explicit CONCAT with NULL (which leads to silently dropping any NULL input):

      DECLARE @sql nvarchar(max) = N'SELECT * FROM dbo.table ';
      DECLARE @where nvarchar(32);
      DECLARE @orderby nvarchar(32) = N' ORDER BY col1';
      SET @sql = CONCAT(@sql, @where, @orderby);
      PRINT @sql;
    

    Output:

      SELECT * FROM dbo.table  ORDER BY col1
    
  • Not using the N prefix when concatenating Unicode string literals > 4000 characters (example here):

      DECLARE @sql nvarchar(max) = '';
    
      SET @sql = @sql + '... literally 4001 characters ...';
    

    The output here (as shown in the example) will be truncated at 4,000 characters. However if you define your strings properly, this won't happen:

      DECLARE @sql nvarchar(max) = N'';
    
      SET @sql = @sql + N'... literally 4001 characters ...';
    

These things can be hard to spot in overly complex dynamic SQL generation, so it's never a bad idea to simplify and try any way you can to divide & conquer the major components in the eventual string. Based on the repro you attempted, I would guess it is almost certainly the "variable declared too short" symptom. The safest approach is to ensure every input to a dynamic SQL string is declared as nvarchar(max); there is no real good reason to use anything else, except for entity names, which are constrained by metadata anyway.
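One way to divide & conquer, sketched here under the assumption that the string is assembled from a few named pieces (the variable names and fragments below are illustrative, not taken from the real procedure), is to compare the length of each piece with the length of the final string:

```sql
-- Illustrative fragments; every piece is declared as nvarchar(max).
DECLARE @select  nvarchar(max) = N'SELECT * FROM dbo.table ';
DECLARE @where   nvarchar(max) = N'WHERE some condition... ';
DECLARE @orderby nvarchar(max) = N'ORDER BY col1;';

DECLARE @sql nvarchar(max) = @select + @where + @orderby;

-- DATALENGTH / 2 counts nvarchar characters including trailing spaces,
-- which LEN would ignore.
SELECT DATALENGTH(@select)  / 2 AS select_chars,
       DATALENGTH(@where)   / 2 AS where_chars,
       DATALENGTH(@orderby) / 2 AS orderby_chars,
       DATALENGTH(@sql)     / 2 AS total_chars;  -- should equal the sum of the parts
```

If total_chars is smaller than the sum of the parts, something was truncated during assembly. If it is NULL, one of the pieces was NULL and the + operator propagated it.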