C# HTML, XML

Markup languages. HTML and XML are text-based markup languages that are commonly used to store and exchange data. Because they are plain text, they are easy to share with other systems.


HTML is everywhere. For this reason, being able to process it is important. The .NET Framework has the tools necessary for basic and advanced manipulation and creation of HTML.

"HTML is the publishing language of the World Wide Web." (World Wide Web Consortium)


Hypertext markup language: HTML

HtmlTextWriter. There are many ways to handle HTML. In this introductory program, we use the HtmlTextWriter type, a class in the System.Web.UI namespace (System.Web assembly of the .NET Framework) that writes HTML markup to an underlying text writer.

Next: In this example we see that a span tag is opened and closed by the HtmlTextWriter.

C# program that writes HTML

using System;
using System.IO;
using System.Web.UI;

class Program
{
    static void Main()
    {
        using (StringWriter stringWriter = new StringWriter())
        using (HtmlTextWriter htmlWriter = new HtmlTextWriter(stringWriter))
        {
            htmlWriter.RenderBeginTag(HtmlTextWriterTag.Span);
            htmlWriter.Write("Perls");
            htmlWriter.RenderEndTag();
            Console.WriteLine(stringWriter);
        }
    }
}

Output

<span>Perls</span>

Manipulate HTML. You can manipulate, generate, or remove HTML markup, and you can also encode entities in HTML. A simple way to handle HTML uses regular expressions, although this approach is not robust against badly formed markup.

HtmlEncode, HtmlDecode, HttpUtility, Paragraph HTML Regex, Remove HTML Tags, Title From HTML
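
As a rough illustration of encoding entities and removing tags, here is a minimal sketch. It assumes System.Net.WebUtility for the encoding (the links above refer to HttpUtility, which offers a similar HtmlEncode method in System.Web), and the HtmlText class and its method names are hypothetical. The regular expression is a simple pattern, not a parser, so it may misfire on comments, scripts, or a '>' inside an attribute value.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper class for illustration only.
static class HtmlText
{
    // Replaces characters such as <, > and & with HTML entities.
    public static string Encode(string text)
    {
        return WebUtility.HtmlEncode(text);
    }

    // Removes anything that looks like a tag: '<', then as few
    // characters as possible, then '>'.
    public static string StripTags(string html)
    {
        return Regex.Replace(html, "<.*?>", string.Empty);
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(HtmlText.Encode("<b>Tom & Jerry</b>"));
        Console.WriteLine(HtmlText.StripTags("<p>Hello <b>world</b></p>"));
    }
}
```

The first line printed should be &lt;b&gt;Tom &amp;amp; Jerry&lt;/b&gt; and the second should be Hello world.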

Validate: A lot of HTML is invalid. You can detect some kinds of invalid HTML with a C# program, for example by checking that angle brackets are matched.

HTML Brackets
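
One possible bracket check, sketched here under assumptions, scans the text once and reports whether every '<' is closed by a '>' before the next '<' appears. This is not necessarily the algorithm the article refers to, and the HtmlBrackets class name is hypothetical. The check is deliberately strict: it also rejects a stray '>' in text, which real HTML tolerates.

```csharp
using System;

// Hypothetical helper class for illustration only.
static class HtmlBrackets
{
    // Returns true when every '<' is closed by a '>' before the
    // next '<', and no '>' appears without a preceding '<'.
    public static bool AreBalanced(string html)
    {
        bool inTag = false;
        foreach (char c in html)
        {
            if (c == '<')
            {
                if (inTag) return false; // '<' inside an open tag
                inTag = true;
            }
            else if (c == '>')
            {
                if (!inTag) return false; // '>' with no open tag
                inTag = false;
            }
        }
        return !inTag; // false if the last tag was never closed
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(HtmlBrackets.AreBalanced("<p>ok</p>"));
        Console.WriteLine(HtmlBrackets.AreBalanced("<p>bad</p"));
    }
}
```

This only checks brackets. It does not verify that tags are nested correctly or that tag names are valid.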

Scrape. Scraping means downloading web pages, scanning the text, and extracting parts of it for use in another application. You can use a C# program to scrape HTML links from web pages.

Scrape HTML
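
The link-extraction step can be sketched as follows. This example assumes the page has already been downloaded into a string (for instance with HttpClient, not shown here), and the LinkScraper class and the sample markup are hypothetical. The pattern only handles quoted href values inside a start tag, so treat it as a starting point rather than a complete scraper.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class for illustration only.
static class LinkScraper
{
    // Matches href="..." or href='...' inside an <a> start tag
    // and captures the quoted value.
    static readonly Regex HrefPattern = new Regex(
        "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']*)[\"']",
        RegexOptions.IgnoreCase);

    public static List<string> GetLinks(string html)
    {
        var links = new List<string>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            links.Add(match.Groups[1].Value);
        }
        return links;
    }
}

class Program
{
    static void Main()
    {
        // Sample markup standing in for a downloaded page.
        string html = "<a href=\"https://example.com/\">Home</a> " +
                      "<A class='x' HREF='/about'>About</A>";
        foreach (string link in LinkScraper.GetLinks(html))
        {
            Console.WriteLine(link);
        }
    }
}
```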

HTML colors. Many different named colors are available in the hypertext markup language. These can also be specified directly inside CSS files. We print all the HTML colors.

Color Table
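
One way to print a table of named colors is sketched below. It assumes the System.Drawing.KnownColor enumeration, whose non-system entries largely overlap the HTML named colors. This may not be the approach the article itself uses, and the HtmlColors class is hypothetical. On the .NET Framework the project needs a reference to System.Drawing.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

// Hypothetical helper class for illustration only.
static class HtmlColors
{
    // Formats a color as a #RRGGBB string, ignoring alpha.
    public static string ToHex(Color color)
    {
        return string.Format("#{0:X2}{1:X2}{2:X2}",
            color.R, color.G, color.B);
    }

    // Yields the known colors, skipping operating system colors
    // such as ActiveBorder.
    public static IEnumerable<Color> NamedColors()
    {
        foreach (KnownColor known in Enum.GetValues(typeof(KnownColor)))
        {
            Color color = Color.FromKnownColor(known);
            if (!color.IsSystemColor)
            {
                yield return color;
            }
        }
    }
}

class Program
{
    static void Main()
    {
        foreach (Color color in HtmlColors.NamedColors())
        {
            Console.WriteLine("{0} {1}", color.Name, HtmlColors.ToHex(color));
        }
    }
}
```

Note that the list includes Transparent, which has no meaningful hex value because the alpha channel is dropped.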
Extensible markup language: XML

XML stores structured data. The .NET Framework provides rich support for XML, including classes in System.Xml and System.Xml.Linq. XML itself is a platform-neutral format, so it is not tied to any specific implementation or programming language.


XElement. This program uses the System.Xml.Linq namespace to load XML files into XElement objects. There are other ways to handle XML but XElement is one of the easiest.

Example XML file

<?xml version="1.0"?>
<tag>Test</tag>

C# program that loads example XML file

using System;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Load the XML.
        XElement element = XElement.Load("C:\\text.xml");
        // Write text value.
        Console.WriteLine(element.Value);
    }
}

Output

Test
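
XElement can also parse XML held in a string and be queried with LINQ. The following is a small sketch with made-up element and attribute names ("items", "item", "name") and a hypothetical XElementDemo class, so no file on disk is needed.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class for illustration only.
static class XElementDemo
{
    // Returns the "name" attribute of each child <item> element.
    public static string[] ItemNames(string xml)
    {
        XElement root = XElement.Parse(xml);
        return root.Elements("item")
                   .Select(e => (string)e.Attribute("name"))
                   .ToArray();
    }
}

class Program
{
    static void Main()
    {
        string xml = "<items><item name=\"a\">1</item>" +
                     "<item name=\"b\">2</item></items>";
        foreach (string name in XElementDemo.ItemNames(xml))
        {
            Console.WriteLine(name);
        }
    }
}
```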

XML types. To continue, you can write XML with XmlWriter and read it back with XmlReader. Both types are fast because they are forward-only and stream the document instead of loading it all into memory. You could use the two types together to implement persistence.

XmlWriter, XmlReader
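
A round trip with these two types might look like the sketch below. It writes a list of strings to an XML string and reads them back. The XmlRoundTrip class and the "items" and "item" element names are hypothetical, and the reader simply collects every text node, which is only adequate for a flat document like this one (an empty string would be written as an empty element and not read back).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper class for illustration only.
static class XmlRoundTrip
{
    // Writes each string as an <item> element inside <items>.
    public static string Write(IEnumerable<string> items)
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (var text = new StringWriter())
        {
            using (XmlWriter writer = XmlWriter.Create(text, settings))
            {
                writer.WriteStartElement("items");
                foreach (string item in items)
                {
                    writer.WriteElementString("item", item);
                }
                writer.WriteEndElement();
            } // Disposing the writer flushes it to the StringWriter.
            return text.ToString();
        }
    }

    // Reads forward through the document and collects text nodes.
    public static List<string> Read(string xml)
    {
        var result = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                {
                    result.Add(reader.Value);
                }
            }
        }
        return result;
    }
}

class Program
{
    static void Main()
    {
        string xml = XmlRoundTrip.Write(new[] { "one", "two" });
        Console.WriteLine(xml);
        foreach (string item in XmlRoundTrip.Read(xml))
        {
            Console.WriteLine(item);
        }
    }
}
```

For real persistence the same calls can target a file path or stream instead of a string.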

XmlTextWriter: The older XmlTextWriter and XmlTextReader types are also available. For new code, the XmlWriter.Create and XmlReader.Create factory methods are generally preferred.

XmlTextWriter, XmlTextReader

"Extensible Markup Language, abbreviated XML, describes a class of data objects called XML documents..." (The World Wide Web Consortium)


In the real world, parsing HTML is fairly difficult because we must support badly formed markup. On the other hand, XML is easy to parse because it has strict rules for correctness.
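
The strictness of XML is easy to observe: a parser rejects markup that a web browser would accept as HTML. Here is a minimal sketch, using a hypothetical XmlCheck class, that treats any XmlException from XElement.Parse as "not well formed".

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper class for illustration only.
static class XmlCheck
{
    // Returns true if the text parses as a single XML element.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XElement.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(XmlCheck.IsWellFormed("<p>Hello</p>"));
        // An unclosed <br> is common in HTML but is an error in XML.
        Console.WriteLine(XmlCheck.IsWellFormed("<p>Hello<br></p>"));
    }
}
```

The first call should print True and the second False.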


Storing data. If you need to store data, using XML is often better than HTML because it is easier to handle in code. When performance matters most, a binary format is usually a better option.

