Tuesday, 5 July 2016

Backing Up Files To Cloud Storage

I have an application that backs up files to cloud storage such as OneDrive. Doing this manually on a PC is easy using Windows Explorer: just copy and paste the files of interest to the local OneDrive folder. How could I automate it? If I just wanted to back up files in a fairly inefficient manner, I could write a .NET console application that does simple file copy operations on the folders of interest.

But, unlike for my local backups, I didn’t necessarily want all files to be readable. I found a free encryption application that was also programmable from C#. However, it is restricted to encrypting files rather than folders. It is easy to get around this: programmatically zip up the folder and encrypt the zip file instead.

Having done that, the next step is to programmatically copy the encrypted zip to the OneDrive folder. I can then use Windows Task Scheduler to run the application at regular intervals.
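As a rough illustration of those steps (this is a sketch, not the original application’s code), the zip and copy can be done with the built-in System.IO.Compression API. The folder paths below are hypothetical and the encryption call is only indicated by a comment, since it depends on whichever third-party tool is used.

// Minimal sketch of the zip-and-copy steps using System.IO.Compression.
using System.IO;
using System.IO.Compression;

class BackupSketch
{
    static void Main()
    {
        string sourceFolder = @"C:\Data\Documents";                  // hypothetical
        string zipPath = @"C:\Temp\Documents.zip";                   // hypothetical
        string oneDrivePath = @"C:\Users\Me\OneDrive\Documents.zip"; // hypothetical

        // 1. Zip the folder of interest.
        if (File.Exists(zipPath))
            File.Delete(zipPath);
        ZipFile.CreateFromDirectory(sourceFolder, zipPath);

        // 2. Optionally encrypt zipPath here using the chosen encryption tool.

        // 3. Copy the result to the local OneDrive folder; the OneDrive client
        //    then syncs it to cloud storage.
        File.Copy(zipPath, oneDrivePath, overwrite: true);
    }
}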

Local Backup

 

I currently have three backups scheduled. One of them is a differential backup using SyncToy, which detects the changes since the last backup and copies just those. So far my cloud backup backs up everything every time. Not very efficient. And, as I’m backing up over the internet, it’s unnecessarily eating into my data allowance.

Comparing Zip Files

 

I found a tool, ZipDiff, that compares zip files looking for differences. For each zipped folder I can run this and then only back up when something has changed. I might still have a big backup, as each zip file can itself be quite large, but it’s better than unnecessarily uploading several zipped folders when nothing has changed.
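ZipDiff is a separate tool with its own API; purely to illustrate the idea (this is not how ZipDiff itself works), a rough comparison can be written against System.IO.Compression by treating two archives as unchanged when they contain the same entries with the same sizes and timestamps.

// Illustrative only: compare two zip files by entry name, uncompressed size and
// last-write time, as an approximate "has anything changed?" check.
using System.IO.Compression;
using System.Linq;

static class ZipComparison
{
    public static bool LooksUnchanged(string zipA, string zipB)
    {
        using (var a = ZipFile.OpenRead(zipA))
        using (var b = ZipFile.OpenRead(zipB))
        {
            var entriesA = a.Entries
                .Select(e => new { e.FullName, e.Length, e.LastWriteTime })
                .OrderBy(e => e.FullName)
                .ToList();
            var entriesB = b.Entries
                .Select(e => new { e.FullName, e.Length, e.LastWriteTime })
                .OrderBy(e => e.FullName)
                .ToList();

            return entriesA.SequenceEqual(entriesB);
        }
    }
}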

Parallel Operation

 

Roughly speaking, for each folder, I need to
  1. Zip
  2. Encrypt (optionally)
  3. Backup
This is easily parallelisable (embarrassingly parallel, as they say), so I can use a parallel for loop. Handling errors requires some care though. One scenario is that certain types of file cause the zip operation to fail if the file is in use; Microsoft Word documents are one such type. However, I wanted the algorithm to continue processing other folders in such cases instead of terminating. This requires a loop like the one below.
try
{
    BackupEncryptedToOneDrive(sourceFolderPathsForEncryption);
}
catch (AggregateException ae)
{
    LogAggregateErrors(ae);
}
private static void BackupEncryptedToOneDrive(IEnumerable<string> sourceFolderPathsForEncryption)
{
    Console.WriteLine(LogMessageParts.FoldersForEncryption);
    Logger.Info(LogMessageParts.FoldersForEncryption);
    Console.WriteLine(Environment.NewLine);

    var exceptions = new ConcurrentQueue<Exception>();

    Parallel.ForEach(sourceFolderPathsForEncryption, path =>
    {
        try
        {
            Console.WriteLine(LogMessageParts.Processing, path);
            Logger.Info(LogMessageParts.Processing, path);

            if (TryCreateZip(path))
            {
                Encrypt(path);
                BackupToOneDrive(path);
            }
            else
            {
                string noChangesDetected = string.Format("No changes detected in {0}...", path);
                Console.WriteLine(noChangesDetected);
                Logger.Info(noChangesDetected);
            }
        }
        catch (Exception ex)
        {
            exceptions.Enqueue(ex);
        }
    });

    Console.WriteLine(Environment.NewLine);

    if (exceptions.Any())
        throw new AggregateException(exceptions);
}

private static void LogAggregateErrors(AggregateException ae)
{
    ae = ae.Flatten(); // flatten tree to process exceptions at the leaves
    foreach (var ex in ae.InnerExceptions) LogError(ex);
}

The idea here is that we queue up the exceptions from each parallel iteration, wrap them up in an AggregateException and then unwrap and log them at the top level. So a failure in one parallel iteration still allows the others to run to completion.

Thursday, 29 October 2015

Exploring Akka.NET for Concurrency and Distributed Computing

Akka.NET is described as “a toolkit and runtime for building highly concurrent, distributed, and fault tolerant event-driven applications on .NET & Mono.” It is a port of the Akka framework for the JVM written in Scala. Its initial release was in April 2015, not long after Microsoft’s similar cloud-oriented Project Orleans (February 2015). Orleans is described as “a framework that provides a straightforward approach to building distributed high-scale computing applications, without the need to learn and apply complex concurrency or other scaling patterns.”

Each of these frameworks is based on the Actor Model of concurrency, of which more later.

Background

I first heard of Akka via a polyglot developer colleague who has extensive experience of both Java and .NET. He happened to get into some Scala development and was fortunate enough to get some experience with Akka. Later on I started encountering various references to .NET Actor frameworks/libraries, almost all in their very early stages. In February 2014 I came across a link to Roger Johansson’s Pigeon project on GitHub that later became Akka.NET. A year later, via my F# Weekly feed, I saw that Akka.NET was in beta, so I browsed to the site and was amazed at how much information was there. There was also a NuGet package for Visual Studio that I tried, and it “just worked”, with no faffing around with configuration. That’s not always the case with open source projects. Then a few weeks after that it reached 1.0.

The Actor Model of Concurrency

The Actor Model in computer science is “a mathematical model of concurrent computation that treats ‘actors’ as the universal primitives of concurrent computation: in response to a message that it receives, an actor can make local decisions, create more actors, send more messages, and determine how to respond to the next message received.”

The Actor Model was invented by Carl Hewitt in 1973 and you can find him explaining the basic ideas at Microsoft’s Channel 9. This is also available on YouTube should you wish to view it there.

“According to Carl Hewitt, unlike previous models of computation, the Actor model was inspired by physics, including general relativity and quantum mechanics.”

Wow! But don't worry. You don't need to understand general relativity and quantum mechanics to get started!

One way of thinking about the Actor Model is by analogy to garbage collection or other automated memory management schemes. You can view garbage collection as providing a high-level abstraction over manual memory management. Similarly you can view the Actor Model as providing a high-level abstraction over manual thread management and synchronization. The Actor Model is attracting a lot of attention now because of the rise of multi-processor and multi-core machines combined with the growth of the internet and highly distributed computing. Actor-based frameworks such as Akka and Orleans are more easily able to handle these scenarios, freeing the developer to concentrate on solving business problems rather than getting bogged down in “low-level” concurrency issues.

Akka.NET

Akka.NET provides an actor system that the user typically arranges into a hierarchy (tree) of actors that communicate with each other via immutable messages. Actors supervise the actors directly below them in the tree and are responsible for handling their failures. When an actor crashes, its parent can either restart or stop it, or escalate the failure up the hierarchy of actors. It is this that enables “self-healing” – fault tolerance and resilience.
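For example, a parent actor can override its supervision strategy to decide, per exception type, whether a failing child is restarted, stopped, or the failure escalated. The following is only a sketch using the standard Akka.NET API, not code from a real application.

// Sketch: a parent actor choosing a supervision directive per exception type.
using System;
using Akka.Actor;

public class ParentActor : UntypedActor
{
    protected override SupervisorStrategy SupervisorStrategy()
    {
        return new OneForOneStrategy(
            3,                          // max number of retries
            TimeSpan.FromSeconds(30),   // within this time window
            ex =>                       // decider
            {
                if (ex is ArithmeticException) return Directive.Restart;
                if (ex is NotSupportedException) return Directive.Stop;
                return Directive.Escalate;
            });
    }

    protected override void OnReceive(object message)
    {
        // Work would be delegated to child actors here.
    }
}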

Each actor has its own state that is not shared with other actors. Actors send messages to other actors asynchronously so that they don’t block. Actors process received messages one at a time. They can also determine how to respond to the next message received. This is called switchable behaviour. Supervision and switchable behaviours are two of the “killer” features of the Actor Model.
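Here is a minimal sketch of these ideas using Akka.NET’s ReceiveActor; the actor, message and system names are made up for illustration. The actor handles one message, then switches to a different behaviour with Become().

using System;
using Akka.Actor;

// An immutable message.
public class Greet
{
    public Greet(string who) { Who = who; }
    public string Who { get; private set; }
}

// An actor with switchable behaviour.
public class GreeterActor : ReceiveActor
{
    public GreeterActor()
    {
        Awake();
    }

    private void Awake()
    {
        Receive<Greet>(msg =>
        {
            Console.WriteLine("Hello, {0}", msg.Who);
            Become(Asleep); // respond differently to the next message
        });
    }

    private void Asleep()
    {
        Receive<Greet>(msg => Console.WriteLine("Zzz... ignoring {0}", msg.Who));
    }
}

public static class Program
{
    public static void Main()
    {
        var system = ActorSystem.Create("demo");
        var greeter = system.ActorOf(Props.Create(() => new GreeterActor()), "greeter");

        greeter.Tell(new Greet("world"));  // handled by the Awake behaviour
        greeter.Tell(new Greet("again"));  // handled by the Asleep behaviour

        Console.ReadLine(); // keep the process alive while messages are processed
    }
}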

Well, that’s the basic idea. There are a lot more features available but I hope this gives you a flavour. Apart from the Akka.NET site you can also find some excellent, well-written blog posts by Petabridge (one of the creators of the framework). They also provide a free online Bootcamp. If you have a subscription to Pluralsight then, at the time of writing, there are four excellent courses on Akka.NET.

 

Monday, 16 February 2015

JavaScript Server-Side Logging with JSNlog

Web applications have become increasingly JavaScript-heavy in recent years as we’ve moved to richer and more responsive applications. It’s fine debugging JavaScript errors in the browser during development, but what about in deployed applications? JSNlog is an open source framework that sends client-side JavaScript log messages to the server, where they can be handled by standard .NET logging frameworks such as NLog, log4net and Elmah. Below I show an example of how to use it with NLog.

Installing NLog

NLog has an installer that’s worth running once, as it supplies some Visual Studio item templates and a code snippet for declaring a logger instance.

private static NLog.Logger logger = NLog.LogManager.GetCurrentClassLogger();

But the installer is not essential; you can install NLog via NuGet. You will need to run both of these commands:

Install-Package NLog



Install-Package NLog.Config



The latter adds a config file (NLog.config). This is where you declare your logging targets (such as log files) and logging rules. For example:



<targets>
  <!-- add your targets here -->
  <target name="logfile" xsi:type="File" fileName="${basedir}/file.txt" />
</targets>

<rules>
  <!-- add your logging rules here -->
  <logger name="*" minlevel="Info" writeTo="logfile" />
</rules>


Logging a Message From NLog



Suppose we have an ASP.NET MVC application. After setting up the above, edit the Home controller like this:



using NLog;
using System.Web.Mvc;

namespace WebApplicationNLog2.Controllers
{
    public class HomeController : Controller
    {
        private static Logger logger = LogManager.GetCurrentClassLogger();

        public ActionResult Index()
        {
            logger.Info("Sample trace message");
            return View();
        }
    }
}


 



Then a message is written to the file file.txt in the project folder. It will look something like this.



2015-02-13 12:32:22.5442|INFO|WebApplicationNLog2.Controllers.HomeController|Sample trace message



Installing JSNlog



There is a specific NuGet package to go with whichever logging framework we happen to be using, so for this example it is:



Install-Package JSNLog.NLog



This installs the dependent JSNLog package among others and also updates the Web.config as required.



Logging JavaScript



Let’s place some arbitrary JavaScript in the Home controller’s Index view.



First we need to configure JSNlog by placing this line before any script tag that uses JSNlog.



@Html.Raw(JSNLog.JavascriptLogging.Configure())


In a real application we would most likely place this in _Layout.cshtml. Now we can start logging.



<script type="text/javascript">
    JL().info("This is a log message");
</script>

Then a message is written to the file file.txt in the project folder. It will look something like this.

2015-02-16 11:27:55.7520|INFO|ClientRoot|This is a log message


All of the logging levels and layout rules that are configurable in frameworks such as NLog and log4net are carried over to the logging of JavaScript in the same way.

Thursday, 9 February 2012

Web Browser Process Statistics Using Windows PowerShell

I use a number of web browsers on my Windows PC. One of them is Google Chrome, which I have been using from not long after its initial release. From Wikipedia: “A multi-process architecture is implemented in Chrome where, by default, a separate process is allocated to each site instance and plugin.” This makes it awkward to work out its memory consumption. It is in fact possible to obtain this information from Chrome itself, though I only discovered that quite recently. Chrome has its own task manager that can report such statistics. Tools –> Task Manager –> Stats for nerds displays the results in a tab called About Memory. It also reports stats for other running browsers. Here are some stats from the top of the About Memory tab:

[Screenshot: the top of Chrome’s About Memory tab]

Notice that here it only reports the usage for the Chrome processes minus plugins and extensions. To get the total figure you need to view the figure at the bottom.

[Screenshot: the total memory figure at the bottom of the About Memory tab]

Windows PowerShell is also able to calculate the total memory consumption by summing up all the processes named Chrome:

$p = (Get-Process Chrome | Measure-Object -Sum WorkingSet).Sum / 1024

Write-Host "Total = "$p" K"

This produces a similar result (consumption fluctuates from moment to moment):

Total =  621508 K
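For comparison, and purely as an illustration (it is not part of the original post), the same sum can be computed from C# using System.Diagnostics:

// Sum the working sets of all chrome.exe processes and report the total in KB.
using System;
using System.Diagnostics;
using System.Linq;

class ChromeMemory
{
    static void Main()
    {
        long totalBytes = Process.GetProcessesByName("chrome")
                                 .Sum(p => p.WorkingSet64);
        Console.WriteLine("Total = {0} K", totalBytes / 1024);
    }
}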

Chrome’s About Memory also produces stats for Firefox. However, once again this excludes plugins and extensions, so Chrome doesn’t help us out here. We can write similar code for Firefox, but this time we also need to include another process called plugin-container, of which there may be zero or more depending on whether the current Firefox instance has had to start one up (i.e., whether the user has happened to run Flash or a PDF reader). The code for this is slightly more involved:

$f = (Get-Process Firefox | Measure-Object -Sum WorkingSet).Sum / 1024

Write-Host "Firefox Total = "$f" K"

$p = (Get-Process "plugin-container" -ErrorAction SilentlyContinue | Measure-Object -Sum WorkingSet).Sum / 1024

Write-Host "Plugin Container Total = "$p" K"

$c = $f + $p

Write-Host "Combined Total = "$c" K"

The first part is the same except for substituting Firefox for Chrome. Then we define another variable for summing up the plugin-container processes. Adding the two variables together gives us the total consumption.

Firefox Total =  556116 K
Plugin Container Total =  45536 K
Combined Total =  601652 K

But notice there’s some extra code we had to use:

-ErrorAction SilentlyContinue

This is required because if there are no active plugin-container processes PowerShell will report an error. The SilentlyContinue argument does what it says.

The current release of PowerShell is v2.0. It is included by default in Windows 7 and Windows Server 2008 R2. It is also available as a free download for Windows XP SP3, Vista and Servers 2003 and 2008.

NuGet, Microsoft’s package manager for Visual Studio 2010, also makes use of PowerShell in its console window. PowerShell comes with a basic script editor supplied by Microsoft but there are more powerful IDEs out there. A good one is PowerGUI, which also has excellent IntelliSense amongst other capabilities. It also has an add-in for Visual Studio 2010 if desired.

Monday, 23 January 2012

New Year, New Language

Functional programming languages are all the rage at the moment. They’re well-suited to parallel programming and the multi-core world. On the Microsoft .NET platform we have F#. I’ve made one or two attempts at learning F# before but lost heart once the going got tough. This time around I’ve decided to make more of an effort. I’ve found that it helps to try more than one learning source, as they differ in the degree of explanation they give for each concept.

Thus far I am consulting primarily F# Programming, Real World Functional Programming (online partial version of the book) and MSDN’s F# Language Reference.

I’ve been thinking about whether the learning-curve from procedural to object-oriented programming is greater than that from OO (or procedural) to functional.

I think the harder part about going from procedural to OO was not the mechanics but OO design perhaps. Whereas with functional I think even the mechanics are quite difficult.

However, it could be that I’ve just forgotten how difficult the procedural to OO transition was!

One initial difficulty, especially for those coming from a C-syntax background, is F#’s syntax. It does look quite alien. Syntax itself should not be that big a deal but when combined with new concepts it does add to the mental load, especially once examples start to get elaborate.

A similar language on the Java JVM is Scala. Its syntax is a cross between C-syntax and Ruby/Python’s. I looked briefly at Scala some time ago and it does seem more accessible initially. Though once you get beyond the basics it becomes as scary as F#! A colleague of mine who’s been using Scala commercially for many months tells me it’s a matter of practice.

Thursday, 7 July 2011

Reactive Extensions 1.0 Stable is Released

Some months ago Microsoft made Reactive Extensions (Rx) an officially supported product and moved it out of Dev Labs to its new site. On June 29th it was officially released as version 1.0. It now also has some very accessible starter documentation on MSDN. Until now, documentation had been scattered between videos, blogs, hands-on labs and the MSDN Rx forum.

Rx is also consumable from LINQPad, which subscribes to the observables that you dump. The example below writes “Hello World” every second but stops after the first five values. If we removed the call to Take(5) it would run for ever; in that case you can stop it by hitting the Stop button in LINQPad.
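The query looks something like this (a reconstruction from the description above rather than the exact screenshot; it assumes a LINQPad C# statements query with System.Reactive.Linq imported):

// Emits a value every second, maps it to "Hello World", takes the first five,
// and dumps the observable so that LINQPad subscribes and displays each value.
Observable
    .Interval(TimeSpan.FromSeconds(1))
    .Select(i => "Hello World")
    .Take(5)
    .Dump();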

[Screenshot: the query and its output in LINQPad]

Thursday, 30 June 2011

Leveraging LINQPad’s Object Visualisation in Visual Studio

When you run a query in LINQPad it produces formatted output on the results tab that you can optionally export to Word, Excel or HTML. Here I show two techniques for visualising any arbitrary .NET object.

The first way is non-invasive, i.e., requires no change to your Visual Studio solution or compiled assemblies.

The second way is invasive but allows you to export any arbitrary .NET object to HTML by leveraging LINQPad’s Dump() extension method.

Technique 1: Add a Visual Studio LINQPad Debugger Visualizer

Download the LINQPad Visualizer from Google Code and follow the instructions from the point where it says “If using the download do this:”.

You should also remember to unblock the DLLs after downloading. Consider this simplified code with fields made public for brevity.

class User
{
    public int UserId;
    public List<Course> Courses;
    public List<Student> Students;
}

class Course
{
    public string CourseCode;
}

class Student
{
    public int StudentId;
}


Set up a collection of Users.



var users = new List<User>
{
    new User
    {
        UserId = 1,
        Courses = new List<Course>
        {
            new Course { CourseCode = "ECU120" },
            new Course { CourseCode = "ECU121" },
            new Course { CourseCode = "ECU122" },
        },
        Students = new List<Student>
        {
            new Student { StudentId = 1 },
            new Student { StudentId = 2 },
        }
    },
    new User
    {
        UserId = 2,
        Courses = new List<Course>
        {
            new Course { CourseCode = "ECU124" },
            new Course { CourseCode = "ECU125" },
            new Course { CourseCode = "ECU126" },
        },
        Students = new List<Student>
        {
            new Student { StudentId = 1 },
            new Student { StudentId = 3 },
        }
    },
    new User
    {
        UserId = 3,
        Courses = new List<Course>
        {
            new Course { CourseCode = "BESU022" },
            new Course { CourseCode = "BESU023" },
            new Course { CourseCode = "ECT034" },
        },
        Students = new List<Student>
        {
            new Student { StudentId = 1 },
            new Student { StudentId = 2 },
        }
    },
};

users.Dump();



Then we can debug into it.



[Screenshot: debugging the users collection in the Watch window]



To do this, set a watch for your object of interest. Here we are inspecting the users collection. In order to view its contents we need to enter



new System.WeakReference(users)



in the Name column as shown.



Clicking on the dropdown for Value shows two LINQPad debugger visualizers. The default visualizer is a JSON visualizer that allows us to see the contained values of objects that haven’t been marked as Serializable. Clicking on the visualizer pops up a Windows Form containing the object’s values.



[Screenshot: the JSON visualizer showing the object’s values]



In this case we lose the type information but we can still see all the values neatly laid out.



However, if we mark the User, Student and Course classes as Serializable



[Serializable]
class User
{
    public int UserId;
    public List<Course> Courses;
    public List<Student> Students;
}

[Serializable]
class Course
{
    public string CourseCode;
}

[Serializable]
class Student
{
    public int StudentId;
}



and then select the lower LINQPad visualizer, we get the type information as well, with a neater layout.



[Screenshot: the LINQPad visualizer showing typed output]



Technique 2: Add a Reference to LINQPad.exe



For this technique we add a reference to LINQPad.exe that allows us to leverage LINQPad’s Dump() extension method for any object…



using System;
using System.Diagnostics;
using System.IO;

/// <summary>
/// LINQPad extension methods.
/// </summary>
public static class LinqPadExtensions
{
    /// <summary>
    /// Writes the object's properties to HTML
    /// and displays them in the default browser.
    /// </summary>
    /// <typeparam name="T">Type of the object being dumped.</typeparam>
    /// <param name="o">The object to dump.</param>
    /// <param name="heading">Optional heading for the output.</param>
    public static void Dump<T>(this T o, string heading = null)
    {
        string localUrl = Path.GetTempFileName() + ".html";

        using (var writer = LINQPad.Util.CreateXhtmlWriter(true))
        {
            if (!String.IsNullOrWhiteSpace(heading))
                writer.Write(heading);

            writer.Write(o);
            File.WriteAllText(localUrl, writer.ToString());
        }

        Process.Start(localUrl);
    }
}



Then we can write users.Dump() as in the example above. On execution this writes the object’s values to HTML and launches the default web browser. This produces the same results as the typed version above.



[Screenshot: the object’s values rendered as HTML in the browser]



This method has been adapted from an example on StackOverflow. One issue to be aware of is that your project should target the full .NET 4, not the Client Profile.