Counting Lines of Text





Problem

You need to count lines of text within a string or within a file.

Solution

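Use the LineCount method shown in the following listing. Given a file name, it reads the entire file into memory; given a string, it uses the string as is. In both cases it counts the number of line feeds.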

LineCount method

using System;
using System.Text.RegularExpressions;
using System.IO;

public static long LineCount(string source, bool isFileName)
{
    if (source != null)
    {
        string text = source;

        if (isFileName)
        {
            using (FileStream FS = new FileStream(source, FileMode.Open,
                                         FileAccess.Read, FileShare.Read))
            {
                using (StreamReader SR = new StreamReader(FS))
                {

                    text = SR.ReadToEnd( );
                }
            }
        }

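        // Each line feed marks the end of one line.
        Regex RE = new Regex("\n", RegexOptions.Multiline);
        MatchCollection theMatches = RE.Matches(text);

        if (isFileName)
        {
            // Assumes the last line of the file ends with a line feed.
            return (theMatches.Count);
        }
        else
        {
            // The last line of a string has no terminator, so add one.
            return (theMatches.Count) + 1;
        }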
    }
    else
    {

        // Handle a null source here.
        return (0);
    }
}

LineCount2, a better-performing alternative to this method, uses the StreamReader.ReadLine method to count lines in a file and a regular expression to count lines in a string, as shown in the following listing.

LineCount2 method

public static long LineCount2(string source, bool isFileName)
{
    if (source != null)
    {
        string text = source;
        long numOfLines = 0;

        if (isFileName)
        {
            using (FileStream FS = new FileStream(source, FileMode.Open,
                                         FileAccess.Read, FileShare.Read))
            {

                using (StreamReader SR = new StreamReader(FS))
                {

                    // ReadLine returns null once the end of the file is reached.
                    while (SR.ReadLine( ) != null)
                    {
                        ++numOfLines;
                    }
                }
            }

            return (numOfLines);
            
        }
        else
        {

            Regex RE = new Regex("\n", RegexOptions.Multiline);
            MatchCollection theMatches = RE.Matches(text);

            return (theMatches.Count + 1);
        }
    }
    else
    {

        // Handle a null source here.
        return (0);
    }
}
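
As a point of comparison, the same line-at-a-time idea can be sketched with File.ReadLines, assuming .NET Framework 4.0 or later is available. The class and method names here (LineCountSketch, CountFileLines) are hypothetical and not part of the recipe. File.ReadLines yields lines lazily, so the whole file is never held in memory at once.

```csharp
using System;
using System.IO;
using System.Linq;

public static class LineCountSketch
{
    // Hypothetical helper, not part of the recipe: counts the lines in a file
    // by enumerating them one at a time.
    public static long CountFileLines(string path)
    {
        if (path == null)
        {
            // Mirror the recipe's handling of a null source.
            return 0;
        }

        // File.ReadLines reads lazily and treats \r\n, \n, and \r as terminators.
        // LongCount (System.Linq) returns a long, matching LineCount2's return type.
        return File.ReadLines(path).LongCount();
    }
}
```

Like LineCount2, this sketch counts a final line whether or not it ends with a terminator.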

The following method counts the lines within a specified text file and a specified string:

	public static void TestLineCount( )
	{
	    // Count the lines within the file TestFile.txt.
	    long fileLines = LineCount(@"C:\TestFile.txt", true);
	    Console.WriteLine("Lines in file: " + fileLines);

	    // Count the lines within a string.
	    // Notice that the \r\n characters start a new line
	    // as well as just the \n character.
	    long stringLines = LineCount("Line1\r\nLine2\r\nLine3\nLine4", false);
	    Console.WriteLine("Lines in string: " + stringLines);   // 4
	}

Discussion

Every line, except possibly the last, ends with a line-terminating character or sequence. In Windows files, the terminator is a carriage return followed by a line feed, which the regular expression pattern \r\n describes. Unix files terminate their lines with just the line-feed character (\n). Because both terminators contain a line feed, the pattern "\n" is the lowest common denominator for the two. Consequently, this method runs a regular expression that looks for the pattern "\n" in a string or file.

Files created on classic Mac OS (before Mac OS X) usually terminate lines with a carriage-return character (\r); modern macOS uses \n, as Unix does. To count the number of lines in a file that uses \r, change the regular expression in the constructor of the Regex object to the following:

	Regex RE = new Regex("\r", RegexOptions.Multiline);


Simply running this regular expression against a string returns the number of lines minus one, because the last line does not have a line-terminating character. To account for this, one is added to the final count of line feeds in the string. For a file, LineCount adds nothing, because it assumes that the last line of the file ends with a line terminator.
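
The terminator handling described above can be folded into one pattern. The following is a minimal sketch, not the recipe's method: the names AnyTerminatorSketch and CountLines are hypothetical, and returning 0 for an empty string is an assumption that differs from LineCount, which returns 1. It matches \r\n, \r, or \n in a single pass and adds one only when the text does not already end with a terminator.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnyTerminatorSketch
{
    // \r\n comes first in the alternation so that a Windows terminator
    // is matched once rather than as a separate \r and \n.
    private static readonly Regex Terminator = new Regex(@"\r\n|\r|\n");

    // Hypothetical helper: counts lines in a string using any of the
    // three common line terminators.
    public static long CountLines(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            // Assumption: null and empty text both contain zero lines.
            return 0;
        }

        long terminators = Terminator.Matches(text).Count;

        // A trailing terminator ends the last line; it does not start a new one.
        char last = text[text.Length - 1];
        bool endsWithTerminator = (last == '\n' || last == '\r');

        return endsWithTerminator ? terminators : terminators + 1;
    }
}
```

With this sketch, CountLines("Line1\r\nLine2\rLine3\nLine4") returns 4 and CountLines("Line1\nLine2\n") returns 2.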

The LineCount method accepts two parameters. The first is a string that contains either the actual text whose lines are to be counted or the path and name of a text file whose lines are to be counted. The second parameter, isFileName, determines whether the first parameter (source) is text or a file path. If this parameter is true, the source parameter is a file path; otherwise, it is simply the text itself.

See Also

See the ".NET Framework Regular Expressions," "FileStream Class," and "StreamReader Class" topics in the MSDN documentation.